COVID-19 trends identified from multiple linked EHR

It is important to know how coronavirus disease 2019 (COVID-19) is progressing to understand how public health interventions affect current trends. A new study looks at ten phenotypes of COVID-19 using linked electronic health records (EHR) on a nationwide scale, using a framework that can be extended or adapted for other research questions.  

Study: Understanding COVID-19 trajectories from a nationwide linked electronic health record cohort of 56 million people: phenotypes, severity, waves & vaccination. Image Credit: ETAJOE/ Shutterstock


Many earlier studies have identified multiple factors linked with severe COVID-19, while others describe prognostic markers for hospitalized patients. Most of these papers covered only one or a few areas, such as a single level of healthcare, examined small groups, or used binary outcomes, and so failed to capture the variability in how this infection presents.

To study COVID-19 severity by outcome over time, while accounting for vaccination status and describing patients' clinical course, the current study used linked nationwide EHR data.

In this study, which appears as a preprint on the medRxiv* server, the sociodemographic factors, occurrence of health conditions, the effect of the first and second waves in England, and vaccination effects, were considered separately.

The researchers also examined the most important pathways taken by COVID-19 patients through the healthcare system, and set out reproducible algorithms to help identify a patient's disease stage.

What did the study show?

The researchers identified approximately 3.5 million infected individuals (6.1% of the cohort) with ~8.8 million recorded COVID-19 phenotypes. Almost 90% had mild to moderate disease or asymptomatic infection.

Over one in ten were hospitalized. Severity phenotypes included intensive care unit admissions (10%), non-invasive ventilation (15%), and invasive ventilation (6%), while 4% died. Almost % received both invasive and non-invasive ventilation, while a small number (n=550) were on extracorporeal membrane oxygenation (ECMO).

The findings showed that 72% of deaths were among hospitalized patients. Of these, 30% of non-ICU patients died in the first wave, vs. 23% in the second wave. If all hospitalized patients were considered, the mortality was almost one-third and over one-fourth, respectively. There was no change in deaths among ICU patients.

About half of critically ill COVID-19 patients receiving non-ICU care died, at 46%, vs. approximately 0% of ICU admissions and 25% of non-critical hospitalized patients.

About a tenth of these deaths occurred within 30 days of a positive test for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), although the disease was not mentioned in the diagnosis or on the death certificate. These patients were less likely to have a positive test and most commonly died of dementia, pneumonia, or lung cancer (17% of the total).

About 7% of cases were identified only from death data, apparently without a recorded positive test or diagnosis before death. These patients were mostly elderly, over 70 years, and almost 90% of them were White, with a median of seven other co-existing medical conditions.

The disease trajectories were longer in the second wave compared to the first. Overall, more than half the deaths occurred in males; over 80% were over 70 years, and a quarter were in the most deprived quintile. Fatal cases had a median of eight comorbidities vs. one in all cases.

Surprisingly, the same fraction of infections, a quarter, occurred in the most deprived fifth. Over 55% of infections were in females, while almost one in seven was in Asian-origin patients. Among four million high-risk patients, almost one in ten were infected, and these patients showed 11% mortality.

Across the pandemic's course, survival time lengthened while the mortality failed to drop.

Multiple sources had to be tapped to capture all diagnoses, including primary care records where half the patients were first diagnosed, test records alone for over a quarter, a fifth with only a primary or secondary health record but no test record, and 7% with only death records.
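As a rough illustration of why several sources had to be combined, the sketch below groups patients by the set of sources in which they appear. The source names, patient IDs, and record layout are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical linked records: patient ID -> set of sources holding a
# COVID-19 record. Names and IDs are illustrative assumptions only.
records = {
    "p1": {"primary_care", "test"},
    "p2": {"test"},
    "p3": {"hospital"},
    "p4": {"deaths"},
    "p5": {"primary_care"},
}

def ascertainment_group(sources):
    """Label a patient by how their infection was captured."""
    if sources == {"deaths"}:
        return "death record only"
    if sources == {"test"}:
        return "test record only"
    if "test" not in sources:
        return "health record, no test"
    return "test plus other record"

# Count how many patients fall into each ascertainment group.
group_counts = Counter(ascertainment_group(s) for s in records.values())
for group, n in sorted(group_counts.items()):
    print(f"{group}: {n}")
```

Dropping any one source from such a linkage would silently remove the patients who appear only there, which is the point the breakdown above makes.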

What are the implications?

This study used a variety of data sources at various levels of healthcare and regions to capture many different phenotypes and exposures as well as disease outcomes. In addition to being among the most comprehensive, it is the largest in its sample size and database. This helped clarify and identify several COVID-19 events, such as the highest mortality among ventilated non-ICU patients.

This design can be adapted for use in other countries to examine disease severity and monitor the pandemic. Importantly, they solved the challenge of matching information from different sources of data covering the same event. This includes the finding cited above, consistent with the overwhelming demand for health services during the first wave, causing critical patient care to spill over to other areas outside the ICU.  This epitomizes the value of linked data sources and stringent phenotypic criteria to exploit data to the maximum and leave as little out as possible.

The researchers identified discrete data sources on ventilated patients by treatment, classified pandemic deaths, and showed how patients move from one COVID-19 event to another.
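One simple way to picture movement between COVID-19 events is to count consecutive event pairs across patients. The following is a minimal sketch under assumed inputs: the event names and the three example trajectories are invented for illustration and do not reproduce the study's algorithms or data.

```python
from collections import Counter

# Hypothetical per-patient event sequences, assumed already sorted by date.
trajectories = [
    ["positive_test", "hospitalization", "icu_admission", "death"],
    ["positive_test", "hospitalization"],
    ["primary_care_diagnosis", "hospitalization", "death"],
]

# Count each transition between one event and the next one in a sequence.
transition_counts = Counter()
for events in trajectories:
    transition_counts.update(zip(events, events[1:]))

for (src, dst), n in transition_counts.most_common():
    print(f"{src} -> {dst}: {n}")
```

Counts like these could then be compared between waves or patient groups to see which pathways carry the worst outcomes.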

The longer median duration between a positive test and hospitalization, or death, in the second wave, could be attributed to more testing and earlier diagnosis and better management, but still showed inequalities persisting among the most deprived and non-White ethnic minorities.

This study thus provides "a means to identify and prioritize care pathways associated with adverse outcomes and highlight healthcare system 'touch points' which may act as tangible targets for intervention." The authors describe this as "unlocking the power of linked data to disentangle the progression of individuals through the healthcare system and disease states over time."

Similar studies could help understand, for instance, vaccine efficacy against existing and new variants by their impact on both patient and healthcare system-level outcomes. This, in turn, would shape public health policy on booster doses, for example, and influence national health and safety. This is the final achievement of the scientists – a "reproducible, extensible and repurposable means to generate national-scale data to support critical policy decision making."

*Important notice

medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice/health-related behavior, or treated as established information.


Written by

Dr. Liji Thomas

Dr. Liji Thomas is an OB-GYN, who graduated from the Government Medical College, University of Calicut, Kerala, in 2001. Liji practiced as a full-time consultant in obstetrics/gynecology in a private hospital for a few years following her graduation. She has counseled hundreds of patients facing issues from pregnancy-related problems and infertility, and has been in charge of over 2,000 deliveries, striving always to achieve a normal delivery rather than operative.


Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Thomas, Liji. (2021, November 11). COVID-19 trends identified from multiple linked EHR. News-Medical. Retrieved on January 28, 2022 from

  • MLA

    Thomas, Liji. "COVID-19 trends identified from multiple linked EHR". News-Medical. 28 January 2022. <>.

  • Chicago

    Thomas, Liji. "COVID-19 trends identified from multiple linked EHR". News-Medical. (accessed January 28, 2022).

  • Harvard

    Thomas, Liji. 2021. COVID-19 trends identified from multiple linked EHR. News-Medical, viewed 28 January 2022,


The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of News Medical.