Environment, vector, or host? Using machine learning to untangle the mechanisms driving arbovirus outbreaks
Abstract: Climatic, landscape and host features are critical components in shaping outbreaks of vector-borne diseases. However, the relationship between outbreaks of vector-borne pathogens and their environmental drivers is typically complicated and nonlinear, and may vary by taxonomic units below the species level (e.g., strain or serotype).
Here, we aim to untangle how these complex forces shape the risk of outbreaks of Bluetongue virus (BTV), a vector-borne pathogen that is continuously emerging and re-emerging across Europe, with severe economic implications. We tested whether the ecological predictors of BTV outbreak risk were serotype-specific by examining the most prevalent serotypes recorded in Europe (1, 4, and 8). We used a robust machine learning (ML) pipeline and 23 relevant environmental features to fit predictive models to 24,245 outbreaks reported in 25 European countries between 2000 and 2019. Our ML models demonstrated high predictive performance for all BTV serotypes (accuracies > 0.87) and revealed strong nonlinear relationships between BTV outbreak risk and environmental and host features. Serotype-specific analysis suggests, however, that each of the major serotypes (1, 4, and 8) had a unique outbreak risk profile. For example, temperature and midge abundance were the most important characteristics shaping serotype 1 outbreak risk, whereas for serotype 4 goat density and temperature were more important. We also identified strong interactive effects between environmental and host characteristics, and these effects were likewise serotype-specific. Our ML pipeline revealed more in-depth insights into the complex epidemiology of BTVs and can guide policymakers in intervention strategies to help reduce the economic implications and social cost of this important pathogen.
Alkhamis MA, Fountain-Jones NM, Aguilar-Vega C, and Sánchez-Vizcaíno JM.