Big Data from Social Media

Data from social media, search queries, and wearable devices create opportunities for big data analytics to be used for personal health management and population health


Health-related data comes not only from scientific research and clinical practice, but also from social media and personal smart devices. With the growing popularity of social media platforms like Twitter, mobile health apps, and fitness wearables, data from these non-traditional sources offer important insights into health-related issues through big data analytics (BDA), and may in the future give patients a way to add to their own medical records. The information generated from these sources can be classified into four types: quantified self-data, location-based information, social networking data, and data from search queries.

Quantified Self-Data

Quantified self-data comes from devices, sensors, and self-reporting. This type of data has positive implications for daily health because it lets users make data-driven decisions about their lifestyle by tracking behaviours such as food consumption or physical activity. Glooko, for example, monitors blood glucose levels and allows people with diabetes to optimize their treatment plan by integrating their food intake and lifestyle data using their smartphone. Quantified self-data provides richer and more detailed information on health risk factors, enables personal data collection over longer periods, and helps device companies examine trends in their users' lifestyles and health in order to improve their devices.

Location-based Information

Location-based information is data derived from global positioning systems (GPS) and open-source mapping and visualization projects. BDA of this information provides insight into environmental and social determinants of health, and can be useful for monitoring disease outbreaks, allergens, pollutants, or water quality near a specific location.
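As a rough illustration of monitoring "near a specific location", the sketch below filters geotagged reports by great-circle distance using the haversine formula. The report structure, coordinates, and the 10 km radius are all made up for the example, and the function names are hypothetical; a real system would sit on top of a proper geospatial data store.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def reports_near(reports, lat, lon, radius_km):
    """Keep only the reports that fall within radius_km of (lat, lon)."""
    return [
        r for r in reports
        if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km
    ]

# Made-up reports, for illustration only.
reports = [
    {"kind": "pollen alert", "lat": 0.0, "lon": 0.05},
    {"kind": "water advisory", "lat": 0.0, "lon": 1.0},
]

nearby = reports_near(reports, lat=0.0, lon=0.0, radius_km=10)
print([r["kind"] for r in nearby])
```

Only the first report is within 10 km of the query point; the second is roughly 111 km away and is filtered out.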

Social Networking Data

Twitter, Facebook, and health-related social networking sites have become key sources for health-related BDA. A 2011 study of Twitter reported that 8.5% of English-language tweets relate to illness and 16.6% to health. Twitter and its analytics have been used to assess disease spread in real time, as during the Influenza A H1N1 outbreak, to discuss non-emergency health care, and to facilitate crisis mapping during emergencies like the Boston Marathon explosion. On other social networking sites like Facebook, BDA is used to monitor how patients discuss concerns about their illness and to gauge ‘word on the street’ perceptions of health issues.

In contrast to Twitter and Facebook, some health-related social networking sites provide information on the spread of infectious disease through crowd surveillance, while others allow users with similar experiences to compare treatments and symptoms by sharing personal health data.
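To make the idea of measuring what share of messages relate to illness or health more concrete, here is a minimal sketch that tags messages by keyword and reports the matching fraction. This is not the method of the 2011 study mentioned above; the keyword lists and sample messages are invented for illustration, and published work generally relies on trained classifiers rather than simple word matching.

```python
import re

# Hypothetical keyword lists; a real study would use a trained classifier.
ILLNESS_TERMS = {"flu", "fever", "cough", "headache", "sick"}
HEALTH_TERMS = ILLNESS_TERMS | {"doctor", "diet", "exercise", "medication"}

def tokenize(text):
    """Lowercase a message and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def share_matching(messages, terms):
    """Return the fraction of messages containing at least one term."""
    if not messages:
        return 0.0
    hits = sum(1 for m in messages if tokenize(m) & terms)
    return hits / len(messages)

# Made-up sample messages, for illustration only.
sample = [
    "Stuck in bed with the flu again",
    "Great game last night!",
    "New diet starts Monday",
    "Traffic is terrible today",
]

illness_share = share_matching(sample, ILLNESS_TERMS)
health_share = share_matching(sample, HEALTH_TERMS)
print(f"illness: {illness_share:.0%}, health: {health_share:.0%}")
```

In this toy sample one message in four mentions an illness term and two in four mention a health term. Keyword matching like this misses sarcasm, slang, and context, which is one reason real analyses need more careful models.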

Data from Search Queries

BDA mines search queries to help clinicians and epidemiologists analyze data within and between patient populations, identify disease trends, and improve population health. Google and Yahoo search engine queries have been found to be highly predictive of a wide range of population-level health behaviours, and have been used to predict flu and Dengue fever epidemics, the seasonality of depression, and the prevalence of smoking.
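A common first check of whether search activity tracks a health outcome is to correlate the two series over time. The sketch below computes a Pearson correlation between weekly query counts and weekly reported cases. All numbers are invented for illustration, and a strong correlation is only a starting point: actual prediction models also have to deal with seasonality, media-driven search spikes, and reporting delays.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Made-up weekly figures, for illustration only.
weekly_flu_queries = [120, 150, 200, 310, 280, 190]
weekly_reported_cases = [10, 14, 19, 33, 29, 18]

r = pearson(weekly_flu_queries, weekly_reported_cases)
print(f"correlation between queries and cases: {r:.3f}")
```

A value close to 1 means the two series rise and fall together in this sample; it does not by itself show that queries can forecast cases ahead of time.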


Hansen, MM, Miron-Shatz, T, Lau, AYS, & Paton, C. Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives. Yearbook of Medical Informatics, p21-26, 2014.

Paul, MJ & Dredze, M. You Are What You Tweet: Analyzing Twitter for Public Health. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, p265-272, 2011.

Big Data Analytics in Health White Paper by Canada Health Infoway.






Written by Fiona Wong, PhD
