Dynamic ElecTronic hEalth reCord deTection (DETECT) of individuals at risk of a first episode of psychosis: a case-control development and validation study

Fusar Poli P.
2020-01-01

Abstract

Background: Many individuals who will experience a first episode of psychosis (FEP) are not detected before occurrence, limiting the effect of preventive interventions. The combination of machine-learning methods and electronic health records (EHRs) could help address this gap.

Methods: This case-control development and validation study is based on EHR data from IBM Explorys. The IBM Explorys Platform holds standardised, longitudinal, de-identified, patient-level EHR data pooled from different health-care systems with distinct EHRs. The present EHR-based studies were retrospective, matched (1:1), case-control studies compliant with RECORD, STROBE, and TRIPOD statements. The study included individuals in the IBM Explorys database who at some point between 1990 and 2018 had a diagnosis of FEP followed by schizophrenia, and psychosis-free matched control individuals from a random subsample of the full cohort. For every individual in the FEP cohort, the individual in the control cohort was matched to have a similar date for inclusion in the database and a similar total observation time. Individuals in the FEP cohort had their index date defined as the first diagnosis of psychosis or the first prescription of antipsychotic medication. Individuals in the control cohort had their index date defined to occur the same number of days after inclusion in the database as their matching FEP individual. The FEP and control cohorts were both randomly split into development and validation datasets in a ratio of 7:3. The subset of individuals in the validation dataset who had all their health-care encounters at providers that were not seen in the development dataset made up the external validation subset.

A novel recurrent neural network model was developed to predict the risk of FEP 1 year before the index date by employing demographics and medical events (in the categories diagnoses, prescriptions, procedures, encounters and admissions, observations, and laboratory test results) dynamically collected in the EHR as part of clinical routine. We named the recurrent neural network Dynamic ElecTronic hEalth reCord deTection (DETECT). The main outcomes were accuracy and area under the receiver operating characteristic curve (AUROC). Decision-curve analyses and dynamic patient journey plots were used to evaluate clinical usefulness.

Findings: The FEP and control cohorts each comprised 72 860 individuals. 102 030 individuals (51 015 matching pairs) were randomly allocated to the development dataset and the remaining 43 690 to the validation dataset. In the validation dataset, 4770 individuals had all their encounters outside of the 118 790 health-care providers that were encountered in the development dataset. The data from these individuals made up the external validation subset. The median follow-up (observation time before index date) was 6·0 years (IQR 3·0–10·4). In the development dataset, DETECT's prognostic accuracy was 0·787 and AUROC was 0·868. In the validation dataset, DETECT's prognostic accuracy was 0·774 and AUROC was 0·856. In the external validation subset, DETECT's balanced prognostic accuracy was 0·724 and AUROC was 0·799. Prevalence-adjusted decision-curve analyses suggested that DETECT was associated with a positive net benefit in two different scenarios for FEP detection.

Interpretation: DETECT showed adequate prognostic accuracy to detect individuals at risk of developing a FEP in primary and secondary care. Replication and refinement in a population-based setting are needed to consolidate these findings.

Funding: Lundbeck.
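
The two cohort-construction rules described in the Methods (the control index date and the provider-disjoint external validation subset) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the dictionary layout, and the toy identifiers (`p1`, `prov_A`, and so on) are hypothetical and chosen only for illustration.

```python
from datetime import date


def control_index_date(case_inclusion, case_index, control_inclusion):
    """Place the control's index date the same number of days after
    database inclusion as the matched FEP individual's index date."""
    offset = case_index - case_inclusion  # timedelta between inclusion and index
    return control_inclusion + offset


def external_validation_subset(validation_encounters, development_providers):
    """Return the IDs of validation individuals whose every encounter took
    place at a provider that never appears in the development dataset."""
    return {
        person
        for person, providers in validation_encounters.items()
        if providers and set(providers).isdisjoint(development_providers)
    }


# Toy data (hypothetical): providers seen during development, and the
# providers each validation individual encountered.
development_providers = {"prov_A", "prov_B"}
validation_encounters = {
    "p1": ["prov_A", "prov_C"],  # one shared provider, so excluded
    "p2": ["prov_C", "prov_D"],  # no shared providers, so included
    "p3": ["prov_B"],            # shared provider, so excluded
}

external_ids = external_validation_subset(validation_encounters, development_providers)
matched_index = control_index_date(
    case_inclusion=date(2010, 1, 1),
    case_index=date(2015, 6, 1),
    control_inclusion=date(2010, 3, 1),
)
```

In this sketch an individual with even one encounter at a development-set provider is excluded, which mirrors the "all their health-care encounters" wording of the abstract; how the study handled individuals with no recorded encounters is not stated, so the sketch simply leaves them out.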

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1361534