MALARIA
Many real-time data streams are relevant to malaria diagnosis and modeling. Data streams with precise geographical position allow country-based models to be applied locally, increasing the accuracy of diagnosis and targeting resources where they will have the most impact.
Data streams with higher temporal update rates can enable a new paradigm in malaria modeling: a shift from off-line analysis of historical data to real-time model parameter estimation, with the parameters of machine-guided diagnostic tools updated on a daily basis.
Detailed measures such as objective rapid diagnostic test (RDT) reading enable, for example, the objective measurement of symptomatic cases for more accurate model parameter estimation. Real-time updates of the prevalence rate enable, for example, modeling and the development of interventions for low-parasite-count, asymptomatic cases that form a large reservoir for human transmission. Objective RDT reading can also increase diagnostic sensitivity and improve the accuracy of prevalence estimation, e.g., when implemented in a maternal workflow, which has been shown to provide precise local prevalence rates in unstructured environments due to an accurate denominator count [Hellewell et al.][van Eijk et al.].
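As a rough illustration of a prevalence rate that updates as new results arrive, the sketch below keeps a rolling window of daily RDT counts and divides positives by the number of people actually tested (the denominator). The class name, the window length, and the daily counts are all hypothetical and are not taken from any study cited here.

```python
from collections import deque


class RollingPrevalence:
    """Prevalence over the most recent `window_days` days of RDT results."""

    def __init__(self, window_days=7):
        # A bounded deque drops the oldest day automatically.
        self.days = deque(maxlen=window_days)

    def add_day(self, positives, tested):
        if positives < 0 or positives > tested:
            raise ValueError("positives must be between 0 and tested")
        self.days.append((positives, tested))

    def estimate(self):
        """Return positives / tested over the window, or None if nobody was tested."""
        positives = sum(p for p, _ in self.days)
        tested = sum(t for _, t in self.days)
        return positives / tested if tested else None


# Hypothetical daily counts (positives, tested) from a single clinic.
tracker = RollingPrevalence(window_days=3)
for positives, tested in [(2, 20), (3, 25), (5, 30), (8, 25)]:
    tracker.add_day(positives, tested)

# Only the last three days remain in the window: 16 positives out of 80 tested.
current_prevalence = tracker.estimate()
print(current_prevalence)
```

Because the denominator is the count of people tested rather than an assumed catchment population, the estimate is only as good as the completeness of that count, which is the point made above about the maternal workflow.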
Updates to socioeconomic factors enable, for example, more accurate probabilistic malaria risk assessment based on housing type and other factors, and individually targeted interventions such as long-lasting insecticidal net (LLIN) distribution with minimal wastage. In a final example, real-time updates to environmental factors, for example using standing-water detection from satellite imagery, enable more accurate modeling of the expected malaria cycle and more accurate prediction of malaria outbreaks.
In partnership with ThinkMD, Northeastern University and Dr. Samuel Scarpino, a feasibility test of the potential impact of real-time data streams on malaria diagnosis was performed with data from 450 patients in Nigeria, using only real-time symptom and RDT data with geographical and temporal precision.
Symptoms, context and objective measurements were treated as longitudinal data points across five neighboring clinics, with the goal of improving the accuracy of diagnosis based on symptoms and context alone. The RDT result was used as an approximation to ground truth for machine learning training. An incremental machine learning classifier was continually updated as new data points arrived, so that it remained responsive to changes in context on a very granular temporal and geographical basis.
Diagnostic accuracy was approximately 20% higher with the machine-learning, context-based approach than with a traditional machine-guided diagnosis approach. Adding more contextual information and using more sophisticated models is expected to improve diagnostic accuracy further.
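The idea of an incrementally updated classifier described above can be sketched as follows. This is not the model used in the feasibility test, whose details are not given here; it swaps in a plain online logistic regression trained by stochastic gradient descent. The two features (fever and rainy season), the simulated patient stream, and all names are assumptions made purely for illustration. Each visit is first predicted from symptoms and context, and the model is then updated with the RDT result as the training label.

```python
import math
import random


class OnlineLogisticClassifier:
    """Logistic regression updated one observation at a time (SGD)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, rdt_positive):
        # Gradient of the log loss for a single observation.
        error = self.predict_proba(features) - float(rdt_positive)
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


def synthetic_visit(rng):
    """Hypothetical visit: [fever, rainy_season] plus a simulated RDT result."""
    fever = float(rng.random() < 0.5)
    rainy_season = float(rng.random() < 0.5)
    logit = -2.0 + 3.0 * fever + 1.0 * rainy_season  # assumed, not measured
    rdt_positive = rng.random() < 1.0 / (1.0 + math.exp(-logit))
    return [fever, rainy_season], rdt_positive


def train_on_stream(n_visits=2000, seed=0):
    rng = random.Random(seed)
    model = OnlineLogisticClassifier(n_features=2)
    correct = 0
    for _ in range(n_visits):
        features, rdt_positive = synthetic_visit(rng)
        # Predict from symptoms and context first, then learn from the RDT.
        predicted_positive = model.predict_proba(features) >= 0.5
        correct += int(predicted_positive == rdt_positive)
        model.update(features, rdt_positive)
    return model, correct / n_visits


model, running_accuracy = train_on_stream()
print(f"running accuracy: {running_accuracy:.2f}")
print(f"P(RDT positive | fever, rainy season): {model.predict_proba([1.0, 1.0]):.2f}")
```

Scoring each visit before the model sees its label gives a running accuracy that reflects how the classifier would have performed in use. A per-clinic or per-week model could be kept in the same way, at the cost of fewer observations per model.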