COVID-19, first identified in China in December 2019, remains an ongoing pandemic. Since May-June 2020, Google (Alphabet Inc.) has used artificial intelligence (AI) to provide forecasts of the COVID-19 outbreak in the USA. Google began offering a similar service for Japan in November 2020.
Junko Kurita and colleagues in Japan compared Google's AI forecasts with a statistical model built by human intelligence and recently presented the results in a medRxiv* preprint.
Epidemic curve, number of newly confirmed patients each day, and forecasting by Google AI and the authors' statistical model. Image Credit: https://www.medrxiv.org/content/10.1101/2020.12.16.20248358v1.full.pdf
The authors regressed the number of patients with onset on day t on the number of patients with onset 14 days earlier, together with traditional surveillance data for common pediatric infectious diseases, including influenza, and prescription surveillance data from seven days earlier. They then predicted the number of newly onset patients for the following seven days, prospectively, and compared the result with Google's AI-produced forecast.
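As a rough illustration of the lagged-regression idea, the following sketch fits a deliberately simplified model with a single predictor: the onset count 14 days earlier. It is not the authors' model. The surveillance covariates are omitted, the function names fit_lagged_ols and forecast are hypothetical, and the daily counts are made up.

```python
def fit_lagged_ols(series, lag):
    """Fit y[t] = a + b * y[t - lag] by ordinary least squares."""
    x = series[:-lag]   # counts observed `lag` days earlier
    y = series[lag:]    # counts on the target day
    n = len(x)
    if n < 2:
        raise ValueError("series too short for the chosen lag")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("lagged counts are constant; slope is undefined")
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast(series, lag, horizon, a, b):
    """Predict the next `horizon` days from counts observed `lag` days before each."""
    if horizon > lag:
        raise ValueError("horizon must not exceed lag (predictors must be observed)")
    n = len(series)
    return [a + b * series[n + h - lag] for h in range(horizon)]


# Made-up, steadily rising daily onset counts (illustration only).
series = [10 + 2 * t for t in range(42)]

a, b = fit_lagged_ols(series, lag=14)
preds = forecast(series, lag=14, horizon=7, a=a, b=b)
print(a, b)
print(preds)
```

Because the predictor lags the target by 14 days, a seven-day forecast needs only counts that have already been observed, which is what makes a prospective prediction possible in this setup.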
The authors evaluated prediction precision using the discrepancy rate: the sum of the absolute differences between the observed data and the prediction, divided by the sum of the observed data.
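The discrepancy rate as defined above can be sketched in a few lines. The function name discrepancy_rate and the daily counts are assumptions made for illustration; the numbers are not taken from the preprint.

```python
def discrepancy_rate(observed, predicted):
    """Sum of absolute differences divided by the sum of the observed data."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    total = sum(observed)
    if total == 0:
        raise ValueError("sum of observed data must be non-zero")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / total


# Made-up daily counts for one week (illustration only).
observed = [100, 120, 110, 130, 125, 140, 135]
predicted = [110, 115, 120, 125, 130, 130, 140]

rate = discrepancy_rate(observed, predicted)
print(f"discrepancy rate: {rate:.2%}")
```

A rate of 0 means the prediction matched the data exactly on every day; larger values mean a larger total gap relative to the number of observed cases.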
The authors found that Google's prediction was significantly negatively correlated with the observed data. Their own model was slightly positively correlated with the observed data, though not significantly.
In absolute terms, the discrepancy rate of Google's prediction was 27.7% for the first week, whereas the discrepancy rate of the authors' model was only 3.47%.
It is noteworthy that this result is tentative: the epidemic curve showing newly onset patients was not yet fixed.
Google started providing similar forecasts for Japan in November 2020. For information about infectious diseases other than COVID-19, the authors used the National Official Sentinel Surveillance for Infectious Diseases (NOSSID) and prescription surveillance (PS), both of which had been available for more than ten years before the COVID-19 outbreak occurred.
In the system used by the authors in this study, the numbers of patients were estimated from the numbers of prescriptions for neuraminidase inhibitors, anti-varicella-herpes-zoster virus (VZV) drugs, antibiotic drugs, antipyretic analgesics, and multi-ingredient cold medications by prefecture each day. The antibiotics were classified into five types: penicillin, cephem, macrolide, new quinolone, and others.
These drugs were chosen to identify clusters of rash, fever, or digestive symptoms in order to detect bioterrorism attacks, emerging diseases, and mass food poisoning. The data are presented on a web page soon after they are acquired: http://prescription.orca.med.or.jp/syndromic/kanjyasuikei/
They evaluated the two models' predictive capability using the discrepancy rate and the correlation coefficient between the predictions and the observed data.
The authors have presented the observed epidemic curve from the end of November and their prediction from November 20, 2020. They also showed the number of newly confirmed cases and Google’s prediction for that number.
They show that Google's prediction has a significant negative correlation with the observed data, while their own model's prediction is positively correlated, though not significantly. The authors write that their model was superior to the Google AI prediction in terms of both the discrepancy rate and the correlation coefficient.
Correlation coefficients are useful for evaluation, but they are insufficient to measure the gap separating data and prediction; they simply indicate whether the two are proportional or not. With this in mind, the authors adopted the discrepancy rate to evaluate the predictions.
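A small sketch shows why a correlation coefficient alone cannot measure the gap. Here a hypothetical forecast is exactly double the observed counts: it is perfectly correlated with the data, yet its discrepancy rate is 100%. The helper names pearson and discrepancy_rate and all numbers are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def discrepancy_rate(observed, predicted):
    """Sum of absolute differences divided by the sum of the observed data."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / sum(observed)


# Made-up observed counts and a forecast that is proportional but far too high.
observed = [100, 120, 110, 130, 125, 140, 135]
doubled = [2 * o for o in observed]

r = pearson(observed, doubled)
d = discrepancy_rate(observed, doubled)
print(f"correlation: {r:.3f}, discrepancy rate: {d:.0%}")
```

The two measures answer different questions: correlation asks whether the forecast rises and falls with the data, while the discrepancy rate asks how far off the forecast is in total.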
The Google AI prediction depends on a mathematical model. It therefore probably cannot explain the several peaks of the COVID-19 outbreak: mathematical models imply that the peak is reached through herd immunity, when the proportion of infected persons exceeds 1 - 1/R0, where R0 is the basic reproduction number.
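The threshold 1 - 1/R0 is easy to evaluate directly. In the sketch below, the function name herd_immunity_threshold is hypothetical and the R0 values are arbitrary examples, not estimates from the preprint.

```python
def herd_immunity_threshold(r0):
    """Proportion infected at which the epidemic peaks in a simple model: 1 - 1/R0."""
    if r0 <= 0:
        raise ValueError("R0 must be positive")
    # For R0 <= 1 the outbreak does not grow, so the threshold is taken as zero.
    return max(0.0, 1.0 - 1.0 / r0)


# Arbitrary example values of R0 (illustration only).
for r0 in (1.5, 2.0, 2.5, 4.0):
    print(f"R0 = {r0}: threshold = {herd_immunity_threshold(r0):.1%}")
```

Under this formula a single R0 yields a single threshold and hence a single peak, which is the reason given above for doubting that such a model can account for repeated peaks.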
The details of Google's forecasting study in Japan were not disclosed. The COVID-19 outbreak was slowing down or stable during that period, so the authors believe that almost any model could probably have predicted outcomes easily. This study evaluates the prediction power of the Google AI forecasting model. By contrast, the statistical model examined in this study can explain the second peak of the epidemic curve around the end of July.
The authors conclude that AI may not predict better than human intelligence, especially in unusual and challenging times such as the current COVID-19 pandemic. They report that their model was more appropriate than Google's, as shown in this first-week evaluation.
medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.
- Interim evaluation of Google AI forecasting for COVID-19 compared with statistical forecasting by human intelligence in the first week. Junko Kurita, Tamie Sugawara, Yasushi Ohkusa. medRxiv 2020.12.16.20248358; https://www.medrxiv.org/content/10.1101/2020.12.16.20248358v1