Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study
Clift AK., Coupland CAC., Keogh RH., Diaz-Ordaz K., Williamson E., Harrison EM., Hayward A., Hemingway H., Horby P., Mehta N., Benger J., Khunti K., Spiegelhalter D., Sheikh A., Valabhji J., Lyons RA., Robson J., Semple MG., Kee F., Johnson P., Jebb S., Williams T., Hippisley-Cox J.
<jats:title>Abstract</jats:title> <jats:sec> <jats:title>Objective</jats:title> <jats:p>To derive and validate a risk prediction algorithm to estimate hospital admission and mortality outcomes from coronavirus disease 2019 (covid-19) in adults.</jats:p> </jats:sec> <jats:sec> <jats:title>Design</jats:title> <jats:p>Population based cohort study.</jats:p> </jats:sec> <jats:sec> <jats:title>Setting and participants</jats:title> <jats:p>QResearch database, comprising 1205 general practices in England with linkage to covid-19 test results, Hospital Episode Statistics, and death registry data. 6.08 million adults aged 19-100 years were included in the derivation dataset and 2.17 million in the validation dataset. The derivation and first validation cohort period was 24 January 2020 to 30 April 2020. The second temporal validation cohort covered the period 1 May 2020 to 30 June 2020.</jats:p> </jats:sec> <jats:sec> <jats:title>Main outcome measures</jats:title> <jats:p>The primary outcome was time to death from covid-19, defined as death due to confirmed or suspected covid-19 as per the death certification or death occurring in a person with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in the period 24 January to 30 April 2020. The secondary outcome was time to hospital admission with confirmed SARS-CoV-2 infection. Models were fitted in the derivation cohort to derive risk equations using a range of predictor variables. Performance, including measures of discrimination and calibration, was evaluated in each validation time period.</jats:p> </jats:sec> <jats:sec> <jats:title>Results</jats:title> <jats:p>During follow-up, 4384 deaths from covid-19 occurred in the derivation cohort, 1722 in the first validation cohort period, and 621 in the second validation cohort period. The final risk algorithms included age, ethnicity, deprivation, body mass index, and a range of comorbidities.
The algorithm had good calibration in the first validation cohort. For deaths from covid-19 in men, it explained 73.1% (95% confidence interval 71.9% to 74.3%) of the variation in time to death (R<jats:sup>2</jats:sup>); the D statistic was 3.37 (95% confidence interval 3.27 to 3.47), and Harrell’s C was 0.928 (95% confidence interval 0.919 to 0.938). Similar results were obtained for women, for both outcomes, and in both time periods. In the top 5% of patients with the highest predicted risks of death, the sensitivity for identifying deaths within 97 days was 75.7%. People in the top 20% of predicted risk of death accounted for 94% of all deaths from covid-19.</jats:p> </jats:sec> <jats:sec> <jats:title>Conclusion</jats:title> <jats:p>The QCOVID population based risk algorithm performed well, showing very high levels of discrimination for deaths and hospital admissions due to covid-19. The absolute risks presented, however, will change over time in line with the prevailing SARS-CoV-2 infection rate and the extent of social distancing measures in place, so they should be interpreted with caution. The model can be recalibrated for different time periods and has the potential to be dynamically updated as the pandemic evolves.</jats:p> </jats:sec>
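The R² and D statistic reported for men in the Results can be related to each other with a short calculation. The sketch below assumes that the reported R² is the explained-variation measure of Royston and Sauerbrei, R²_D = (D²/κ²) / (π²/6 + D²/κ²) with κ² = 8/π; the abstract does not state this explicitly, and the function name `r2_from_d` is hypothetical, introduced only for illustration.

```python
import math

# Scaling constant from Royston and Sauerbrei's D statistic: kappa^2 = 8 / pi.
KAPPA_SQ = 8.0 / math.pi


def r2_from_d(d: float) -> float:
    """Convert a D statistic to R^2_D (assumed definition, see lead-in).

    R^2_D = (D^2 / kappa^2) / (pi^2 / 6 + D^2 / kappa^2)
    """
    scaled = d * d / KAPPA_SQ
    return scaled / (math.pi ** 2 / 6.0 + scaled)


# D statistic reported in the abstract for deaths from covid-19 in men.
r2_men = r2_from_d(3.37)
print(f"R^2_D for D = 3.37: {r2_men:.1%}")
```

Under this assumption, D = 3.37 corresponds to an R² of about 73.1%, which agrees with the figure given in the abstract. Applying the same transformation to the confidence limits of D gives only approximate agreement with the reported interval for R², so the published limits should be preferred.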