
Big data, medicines safety and pharmacovigilance


Since the 1990s, the concept of big data has emerged to describe larger, more diverse and more relevant data sets, which have contributed to new drug development, improved clinical practice and healthcare financing in the healthcare industry [1]. Big data analysis can draw on a large pool of digital medical records or administrative data, including drug safety reports, drug prescriptions and hospital discharge datasets [2].

Many rare adverse effects remain undetected because clinical trials sample a limited number of individuals; hence, drugs must be monitored even after their release onto the market. In this context, “pharmacovigilance” helps to collect, analyze, and disseminate adverse drug reaction (ADR) reports gathered during the post-marketing phase [3, 4].

Data mining from drug safety report databases and medical literature is a time-consuming task; however, with the digital revolution, researchers are exploring whether big data can be used to study and monitor drug safety. In many developed countries, automated, database-driven drug safety surveillance is becoming increasingly common [2]. This involves using electronic methods to systematically analyze large volumes of information, which can help detect data patterns that point to new adverse drug reactions not identifiable through normal screening [2]. This commentary discusses big data, artificial intelligence and the use of social media. It also elaborates how “big data” feeds into evaluating the safety of new and orphan medicines (Fig. 1).

Fig. 1
A framework showing the possible linkages between big data and pharmacovigilance
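
The automated screening of safety report databases described in the opening section can be illustrated with a disproportionality measure. The commentary does not name a specific statistic; the sketch below assumes the proportional reporting ratio (PRR), one commonly used screening measure, and the report tuples, drug names and event names are hypothetical.

```python
# Minimal sketch: screening spontaneous reports with a proportional
# reporting ratio (PRR). The data layout and all names are assumptions
# made for illustration, not taken from any real safety database.

def contingency_counts(reports, drug, event):
    """Build the 2x2 table for one drug-event pair.

    reports is an iterable of (drug_name, event_name) tuples.
    Returns (a, b, c, d):
      a = reports with the drug and the event
      b = reports with the drug and any other event
      c = reports with other drugs and the event
      d = reports with other drugs and other events
    """
    a = b = c = d = 0
    for report_drug, report_event in reports:
        if report_drug == drug and report_event == event:
            a += 1
        elif report_drug == drug:
            b += 1
        elif report_event == event:
            c += 1
        else:
            d += 1
    return a, b, c, d


def proportional_reporting_ratio(a, b, c, d):
    """PRR = [a / (a + b)] / [c / (c + d)], or None if undefined."""
    if a + b == 0 or c == 0:
        return None
    return (a / (a + b)) / (c / (c + d))


# Hypothetical toy data set of (drug, event) reports.
reports = (
    [("drug_x", "rash")] * 4
    + [("drug_x", "nausea")] * 6
    + [("drug_y", "rash")] * 2
    + [("drug_y", "nausea")] * 18
)
counts = contingency_counts(reports, "drug_x", "rash")
prr = proportional_reporting_ratio(*counts)
print(counts, prr)
```

In this toy data set, rash accounts for 4 of 10 reports for drug_x against 2 of 20 for the comparator, giving a ratio of about 4. A ratio well above 1 would only flag the pair for clinical review; it does not establish causality.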

Artificial intelligence and pharmacovigilance

To better understand the use of artificial intelligence in pharmacovigilance, it may be useful to define it in terms of methods, tasks and data sets [5]. Machine learning is the branch of artificial intelligence concerned with the ability of machines to learn from data without explicit human instruction. Owing to improved computational techniques and the availability of larger datasets, the adoption of machine learning in healthcare is increasing [6].

For automated signal generation in pharmacovigilance, both supervised and unsupervised machine learning approaches are used. The unsupervised approach is used to identify drug safety signals and to explore patterns of drug utilization. In supervised machine learning, by contrast, the computer is given examples labeled with the desired output and derives an algorithm from them [7]. This can be illustrated by the identification of an ADR from free text [8]. An identification pattern is first created from information extracted from a sample of medical records, and the resulting algorithm is then applied to the full set of electronic medical records. The process relies on natural language processing (NLP), which can be applied to identify drug interactions from clinical notes and to find associations between drugs and potential ADRs [9].
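
The idea of applying an identification pattern to free text can be sketched as follows. This is a deliberately simple, rule-based illustration rather than the method used in the cited studies: the drug and reaction word lists are hypothetical, and a drug and a reaction are paired only when they occur in the same sentence.

```python
import re

# Hypothetical lexicons; a real system would use curated vocabularies.
DRUG_TERMS = {"warfarin", "amoxicillin"}
ADR_TERMS = {"rash", "bleeding", "nausea"}


def split_sentences(note):
    """Split a clinical note into sentences on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", note) if s.strip()]


def find_drug_adr_pairs(note):
    """Return (drug, reaction) pairs that co-occur within one sentence."""
    pairs = []
    for sentence in split_sentences(note):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for drug in sorted(DRUG_TERMS & words):
            for adr in sorted(ADR_TERMS & words):
                pairs.append((drug, adr))
    return pairs


note = (
    "Patient started amoxicillin last week. "
    "Developed rash after amoxicillin. "
    "No bleeding noted."
)
print(find_drug_adr_pairs(note))
```

The sketch ignores negation ("no rash after amoxicillin" would still be paired), misspellings and multi-word terms, which is part of why supervised NLP models trained on annotated records are used in practice.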

Social media

With its increasing use, social media is becoming a very useful tool to promote pharmacovigilance. However, several regulatory and technical challenges need to be addressed before its true potential can be realized. Data can be collected from platforms such as Twitter or Facebook, where many patients share their personal experiences of a particular drug or therapy, thus providing a good source for early signal detection [10]. The challenge is the accuracy of the information posted on these websites. Several methods are in place to cross-check the reliability of the data. One such method adapts “Fuzzy Formal Concept Analysis”, which verifies the data by checking it against official information sources [11]. Social media could be particularly useful in low- and middle-income countries (LMICs), where it is difficult to obtain accurate electronic data and large populations have started using Twitter and Facebook.
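
The cross-checking idea can be sketched in a much simpler form than Fuzzy Formal Concept Analysis, which this example does not implement. It assumes a hypothetical lookup table of officially listed reactions and uses approximate string matching from Python's standard `difflib` module to tolerate misspellings in posts.

```python
from difflib import get_close_matches

# Hypothetical table of reactions listed in an official source.
OFFICIAL_ADRS = {
    "drug_x": ["headache", "nausea", "dizziness"],
}


def classify_mention(drug, reported_effect, cutoff=0.8):
    """Compare a reaction reported in a post with the official list.

    Returns ("known", matched_term) when the reported effect closely
    matches a listed reaction, otherwise ("unlisted", None).
    """
    known = OFFICIAL_ADRS.get(drug, [])
    match = get_close_matches(reported_effect.lower(), known, n=1, cutoff=cutoff)
    if match:
        return ("known", match[0])
    return ("unlisted", None)


print(classify_mention("drug_x", "nausia"))     # misspelled but listed
print(classify_mention("drug_x", "hair loss"))  # not on the official list
```

Mentions classified as unlisted are the interesting ones for early signal detection, but they would still need review, since a post may be inaccurate or may describe symptoms of the underlying disease.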

Orphan drugs

In the past, treating rare diseases was a challenge. There was little interest in developing new medicines for these diseases because market incentives were weak. To overcome this problem, several initiatives were taken in the United States, including the Orphan Drug Act of 1983, the Rare Diseases Act of 2002, the Precision Medicine Initiative, and the 2016 Orphan Products Natural History Grants Program. As a result, the number of orphan medicines increased: by 2016, 3735 products were registered as orphan drugs in the US and 1314 medicines were registered in Europe [12]. Because the number of people using orphan drugs is very small, conducting pharmacovigilance is a challenge. To address this, some countries have established patient support programs (PSPs), whose purpose is to create awareness about orphan diseases and the use of orphan medicines. These programs are also expected to help produce drug safety reports [13].

COVID-19 and pharmacovigilance

In the era of COVID-19, medicines usage and pharmacovigilance are transforming rapidly, and a large volume of data is being generated. Analyzing such a volume of data requires both artificial intelligence and big data analytical techniques. This is also creating opportunities for researchers and healthcare professionals to devise innovative solutions during the pandemic [14, 15, 16]. The pandemic has also led to large investments, with a focus on the much-needed infrastructure to monitor vaccine safety, and to renewed interest in this area across a spectrum of middle- and high-income countries.


This commentary sets the scene with regard to big data, medicines safety, and pharmacovigilance. It describes how access to big data can improve medicines safety. The framework describes the influence of social media and artificial intelligence on big data analytics. It also explains how this feeds into evaluating orphan and new medicines, especially vaccines in the COVID-19 context.



US: United States
NLP: Natural language processing
PSPs: Patient support programs
LMICs: Low- and middle-income countries


  1. Bate A, Reynolds RF, Caubel P. The hope, hype and reality of Big Data for pharmacovigilance. Ther Adv Drug Saf. 2018;9:5–11.

  2. Ventola CL. Big data and pharmacovigilance: data mining for adverse drug events and interactions. Pharm Ther. 2018;43(6):340.

  3. Hussain R, Hassali MA. Current status and future prospects of pharmacovigilance in Pakistan. J Pharm Policy Pract. 2019;12(1):1–3.

  4. Hussain R, Hassali MA, Babar ZUD. Medicines safety in the globalized context. In: Global pharmaceutical policy. Singapore: Palgrave Macmillan; 2020. p. 1–28.

  5. Hauben M, Hartford CG. Artificial intelligence in pharmacovigilance: scoping points to consider. Clin Ther. 2021.

  6. Keane PA, Topol EJ. AI-facilitated health care requires education of clinicians. Lancet. 2021;397(10281):1254.

  7. Harpaz R, DuMouchel W, Shah NH, Madigan D, Ryan P, Friedman C. Novel data-mining methodologies for adverse drug event discovery and analysis. Clin Pharmacol Ther. 2012;91(6):1010–21.

  8. Reps JM, Garibaldi JM, Aickelin U, Gibson JE, Hubbard RB. A supervised adverse drug reaction signalling framework imitating Bradford Hill’s causality considerations. J Biomed Inform. 2015;56:356–68.

  9. Nikfarjam A, Sarker A, O’Connor K, Ginn R, Gonzalez G. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features. J Am Med Inform Assoc. 2015;22(3):671–81.

  10. Sarker A, Gonzalez G. Portable automatic text classification for adverse drug reaction detection via multi-corpus training. J Biomed Inform. 2015;53:196–207.

  11. De Rosa M, Fenza G, Gallo A, Gallo M, Loia V. Pharmacovigilance in the era of social media: discovering adverse drug events cross-relating Twitter and PubMed. Future Gen Comput Syst. 2021;114:394–402. ISSN 0167-739X.

  12. Joppi R, Bertele V, Garattini S. Orphan drugs, orphan diseases. The first decade of orphan drug legislation in the EU. Eur J Clin Pharmacol. 2013;69:1009–1024.

  13. Price J. What can big data offer the pharmacovigilance of orphan drugs? Clin Ther. 2016;38(12):2533–45.

  14. Li J. Unexpected opportunities for innovators in the post COVID world. Posted May 2, 2020. Accessed 22 Dec 2020.

  15. Royster K. Friedman lab finds unexpected opportunities in COVID adjusted research activities. Posted November 30, 2020. Accessed 24 Dec 2020.

  16. Beninger P. Influence of COVID-19 on the pharmacovigilance workforce of the future. Clin Ther. 2021.




Author information


The author read and approved the final manuscript.

Corresponding author

Correspondence to Rabia Hussain.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Cite this article

Hussain, R. Big data, medicines safety and pharmacovigilance. J of Pharm Policy and Pract 14, 48 (2021).
