Tracking Cholera in Haiti: Social Media and Data Mining


Interview with Dr. Rumi Chunara… Continued

Physician Megan Coffee in Haiti using HealthMap (Photo by Paul Sebring)

Decoded Science: Technology currently has the capability to data-mine after the fact, to find information about the spread of disease – do you foresee these principles being used in a program to identify the spread of disease in a real-time fashion, so that health workers can reduce the severity of outbreaks that are already in progress?

Dr. Chunara: Yes, a benefit of this data is that it is available in real time, so it could be used in a program in real time, so that response measures can be focused and timely. Additionally, it would be interesting to consider how this can be done prospectively.

Decoded Science: What do you consider to be the most significant aspect of this research?

Dr. Chunara: We hope that this can inspire further study about how to use novel forms of surveillance. 

What is Data Mining?

Data mining is a method of sorting bits of information and assigning values to them. Imagine each Tweet during a given time period as a child’s block; each block has sides of different colors according to the content of the Tweet. A data mining program can pick out all of the blocks that have red on them, that is, the Tweets whose keywords show that they pertain to your chosen subject. You can then ask the program to sort those Tweets by the amount of red on the block, by the number of faces that contain red, or by any other criteria that you devise.

It may be easy for us to imagine physically sorting through piles of blocks, but the amount of data on the Internet is so vast that a computer is necessary for a timely result.
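To make the block analogy concrete, here is a minimal Python sketch of keyword-based filtering and sorting. It is only an illustration, not the method used in the study: the keyword list, the function names (keyword_hits, mine), and the sample messages are all made up for this example.

```python
import re
from collections import Counter

# Hypothetical keyword list; the article does not say which terms were used.
KEYWORDS = {"cholera", "diarrhea", "outbreak"}


def keyword_hits(text, keywords=KEYWORDS):
    """Count how many words in the text match a keyword (case-insensitive)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sum(counts[k] for k in keywords)


def mine(tweets, keywords=KEYWORDS):
    """Keep tweets with at least one keyword, most keyword-heavy first."""
    scored = [(keyword_hits(t, keywords), t) for t in tweets]
    relevant = [(score, t) for score, t in scored if score > 0]
    return sorted(relevant, key=lambda pair: pair[0], reverse=True)


# Made-up sample messages, for illustration only.
sample = [
    "Cholera outbreak reported near the river",
    "Traffic is terrible today",
    "Clinic treating cholera patients",
]

for score, tweet in mine(sample):
    print(score, tweet)
```

In the analogy, keyword_hits measures how much "red" a block has, and mine discards the blocks with no red and orders the rest. A real system would need far more than this, such as handling multiple languages, misspellings, and location information.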

Tracking the Spread of Disease with Computers

Although current technology limits us to a real-time view of spreading disease, it’s entirely possible that advanced data mining models could suggest the future spread of outbreaks such as cholera. If a data mining program can identify the characteristics of an area that is likely to yield fossils, why not a data mining program that identifies the characteristics of areas or populations that are vulnerable to diseases such as cholera? Only time will tell.


Chunara, R., Andrews, J. R., Brownstein, J. S. Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak. (2012). American Journal of Tropical Medicine and Hygiene. Accessed January 9, 2012.

HealthMap. About HealthMap. (2012). Accessed January 9, 2012.
