Between 17 and 24 million people worldwide suffer from chronic fatigue syndrome, a deeply debilitating and difficult-to-diagnose condition. According to the World Health Organization, the condition, also known as myalgic encephalomyelitis, causes a wide range of symptoms, including difficulty sleeping and feeling unwell after exertion, that combine to produce a hard-to-explain feeling of extreme, chronic fatigue. Some patients may have serious problems carrying out their usual activities or concentrating, and may even become bedridden.
"With such a wide range of symptoms that can worsen over time, the difficulty in making a diagnosis lies in the lack of diagnostic tests and biomarkers to define the affected patient," said Marcos Lacasa, a researcher currently working on his PhD thesis on the Universitat Oberta de Catalunya's (UOC) doctoral programme in Bioinformatics. "Diagnosis depends on the history and the doctor. An early diagnosis can have a big impact on the course of the disease."
In his latest paper, co-authored with his thesis supervisors Jordi Casas, from the UOC's Applied Data Science Lab, and José Alegre, from the Vall d'Hebron Institute of Research (VHIR), alongside Ferran Prados, also a researcher at the Applied Data Science Lab, Lacasa analyses how machine learning, a type of artificial intelligence (AI), can provide a better understanding of the disease and improve diagnosis. The paper has been published in Nature's open access Scientific Reports journal.
AI and synthetic patients
In the absence of clear biomarkers, there are currently no tests to diagnose whether someone has chronic fatigue syndrome or not. Although a great deal of research has been done in this area (the same group of researchers suggest in another recent article that patients' oxygen consumption levels should be used as a reference), diagnoses are primarily made on the basis of questionnaires that assess a person's perception of their fatigue. These questionnaires, such as the 36-Item Short Form Health Survey (SF-36), are well-defined and standardized. However, early diagnosis is still difficult.
"What we have shown is that we can simulate a patient's condition in different areas based on their answers to a questionnaire. In other words, we could provide non-specialists with a machine learning application that could even predict a patient's performance on a stress test based on about forty questions. This would act as a warning of symptoms that could be associated with myalgic encephalomyelitis and would expedite the referral of the patient to the nearest specialized unit. In short, it would make early diagnosis more feasible."
Marcos Lacasa, Researcher, Universitat Oberta de Catalunya
The main challenge with this approach is having enough quality data to train the AI algorithm, so that it can then predict answers. "The application can provide AI-generated answers. A patient would not have to fill in six different questionnaires for us to know their overall condition. By filling in just one, the AI would fill in the rest," added Lacasa.
The solution proposed in the paper is to create what the researchers call synthetic patients. This approach allows data from a single general questionnaire to be used to fill in specialized questionnaires, or even to replace missing data. "We can carry out scientific studies using data that are 'made up' by AI, but that retain the statistical characteristics of real patients. The main advantage is that these synthetic data can be shared without fear of compromising private data of any kind."
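To make the idea of synthetic patients more concrete, here is a minimal sketch in Python. It is not the method used in the paper, which this article does not describe in detail. It swaps in a much simpler stand-in technique, a multivariate Gaussian fitted to questionnaire scores, to show two things: sampling new rows that keep the mean and correlations of the originals, and estimating unanswered scores from one that is known. The column names are hypothetical, and the "patient" scores are randomly generated placeholders, not real clinical data.

```python
import numpy as np

# Hypothetical column names; real questionnaires contain many more items.
COLUMNS = ["general_score", "fatigue_score", "sleep_score"]


def fit_gaussian(scores):
    """Estimate the mean vector and covariance matrix of the score table."""
    mean = scores.mean(axis=0)
    cov = np.cov(scores, rowvar=False)
    return mean, cov


def sample_synthetic(mean, cov, n, rng):
    """Draw n synthetic rows that share the fitted mean and covariance."""
    return rng.multivariate_normal(mean, cov, size=n)


def predict_missing(mean, cov, known_idx, known_values):
    """Conditional mean of the unknown columns given the known ones."""
    known_idx = np.asarray(known_idx)
    unknown_idx = np.setdiff1d(np.arange(len(mean)), known_idx)
    cov_kk = cov[np.ix_(known_idx, known_idx)]
    cov_uk = cov[np.ix_(unknown_idx, known_idx)]
    delta = np.asarray(known_values, dtype=float) - mean[known_idx]
    return mean[unknown_idx] + cov_uk @ np.linalg.solve(cov_kk, delta)


# Placeholder data standing in for real answers (assumption: invented
# numbers with built-in correlations, used only to exercise the code).
rng = np.random.default_rng(0)
general = rng.normal(50, 10, size=500)
fatigue = 10 + 0.8 * general + rng.normal(0, 3, size=500)
sleep = 100 - 0.5 * general + rng.normal(0, 3, size=500)
toy_scores = np.column_stack([general, fatigue, sleep])

mean, cov = fit_gaussian(toy_scores)

# 2,000 synthetic "patients" with the same overall statistics.
synthetic = sample_synthetic(mean, cov, 2000, rng)

# Estimate the two remaining scores from the general score alone.
predicted = predict_missing(mean, cov, [0], [30.0])
```

A Gaussian model is only a rough approximation here: questionnaire answers are bounded and often discrete, so a realistic generator would need to respect those constraints, and any synthetic data would still have to be validated against real patient data before use.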
In search of a treatment for chronic fatigue syndrome
The model proposed by the UOC and VHIR researchers has advantages, but also limitations. "Misuse of the synthetic data would invalidate the analyses. Likewise, it's still necessary to have real input data, such as those provided by the SF-36 questionnaire," said Lacasa. The advantages lie in having a tool that can provide high-fidelity synthetic data for research and educational purposes, free from legal, privacy, security and intellectual property restrictions.
In addition to improving diagnosis through questionnaires, other parallel lines of research into chronic fatigue syndrome are also being pursued. The search for biological markers that can be used to develop effective diagnostic tests is high on the list of priorities, along with the development of treatments. There is currently no cure. Instead, treatments are aimed at relieving symptoms through sleep hygiene, dietary changes, exercise, therapies and medications that target the predominant symptoms.
"What we would need is more funding to do genetic sequencing on patients with myalgic encephalomyelitis. Then we could do a genomic analysis and find out whether there is a protein that causes the disease. This would make it much easier to design an effective drug to alleviate the symptoms," Lacasa concluded.
- Lacasa, M., et al. (2023). A synthetic data generation system for myalgic encephalomyelitis/chronic fatigue syndrome questionnaires. Scientific Reports. doi.org/10.1038/s41598-023-40364-6.
- Lacasa, M., et al. (2023). Unsupervised cluster analysis reveals distinct subtypes of ME/CFS patients based on peak oxygen consumption and SF-36 scores. Clinical Therapeutics. doi.org/10.1016/j.clinthera.2023.09.007.