MAMC Journal of Medical Sciences

Year : 2019  |  Volume : 5  |  Issue : 3  |  Page : 103--104

Technology, Healthcare, and Big Data Analytics

Rajeew Prabhat Tiwari1, Girdhar Verma2, Sukanya Ghildiyal2, Syed Shariq Naeem3
1 Real World Evidence, Gurgaon, Haryana, India
2 THB: Technology, Healthcare, Big Data Analytics, Gurgaon, Haryana, India
3 Department of Pharmacology, Jawaharlal Nehru Medical College, Aligarh Muslim University, Aligarh, Uttar Pradesh, India

Correspondence Address:
Dr. Rajeew Prabhat Tiwari
Associate Vice President, Real World Evidence, THB: Technology, Healthcare, Big Data Analytics, 7 (2 bays) Urban Estate, Sector 32, Gurgaon, 122001, Haryana

How to cite this article:
Tiwari RP, Verma G, Ghildiyal S, Naeem SS. Technology, Healthcare, and Big Data Analytics. MAMC J Med Sci 2019;5:103-104



Big data analytics is the analysis of massive data sets. Big data is characterized by value (benefits for healthcare stakeholders), volume (quantities measured in terabytes, petabytes, and yottabytes), velocity (constant data addition, processing, and analysis), variety (complexity and heterogeneity), veracity (correctness), and variability (consistency over time).[1],[2],[3]

In the late 20th century, with the rise of information technology and the availability of computer systems, the transition from written to digital medical records began. Doctors have been treating patients for centuries and, in the process, have generated a vast repository of data. Handwritten records can be digitized, and over the past decade many records have been created digitally as more hospitals and clinics adopt Electronic Medical Records. Compared with handwritten records, Electronic Medical Records are far less prone to spelling errors, legibility issues, inconsistent terminology, incompleteness, inaccuracy, and unnecessary information.[4] We have accumulated enormous amounts of data, and the repository grows every moment. The data come from hospitals and clinics; outpatient and inpatient departments; pharmacies; pathology, biochemistry, and microbiology laboratories; radiology departments; superspecialties such as pulmonology, infectious diseases, neurology, nephrology, rheumatology, oncology, endocrinology, and gastroenterology; nursing records, physician notes, orders, and consultations; patient measurements and patient care procedures; medical devices and patient monitors; and other sources.

Incoming data are anonymized: all patient identifiers are removed to maintain the integrity, privacy, and confidentiality of the data. Anonymization protects patients' rights, safety, and wellbeing. The datasets arrive in structured, semistructured, or unstructured form because every facility has its own way of storing data. Moreover, prescriptions, patient notes, and investigation reports are written in different patterns and contain nonstandardized medical terminology and abbreviations. Diagnoses, biomarkers, and other variables may also be coded differently from one facility to another.

Big data analytics uses all accumulated data on all available variables. Data should be easy to aggregate at the level of the patient, prescription, biomarker category, diagnosis, treatment, drug class, molecule, and brand manufacturer. The unstructured data are restructured (which increases the number of variables available for analysis) and then analyzed with programming languages such as R, Python, C++, and Java, often through environments such as RStudio and Jupyter.

Structuring the data as one single datasheet is hard, and such an extensive datasheet is difficult to manage. Instead, we split the master datasheet into multiple smaller tables connected through unique IDs. For example, separate tables may hold patient profiles, prescription details, drug/molecule/class switching, duration details, diagnoses, consultation visit details, billing costs, and dosages. To get actionable insights, we sometimes need to combine two tables for a specified question, for instance, how costs differ between patients with chronic kidney disease and those with coronary artery disease. This requires the diagnosis and billing cost tables. Keeping the working data small allows for quicker and more accurate results.
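The two-table question above can be sketched as a join on the shared patient ID. This is a minimal illustration only: the table names, column names, and cost figures are hypothetical and are not taken from any real dataset or from the authors' platform. It uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical schema and toy values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE diagnosis (patient_id TEXT, diagnosis TEXT);
    CREATE TABLE billing   (patient_id TEXT, visit_id TEXT, cost REAL);
""")
conn.executemany("INSERT INTO diagnosis VALUES (?, ?)", [
    ("P001", "CKD"), ("P002", "CKD"), ("P003", "CAD"), ("P004", "CAD"),
])
conn.executemany("INSERT INTO billing VALUES (?, ?, ?)", [
    ("P001", "V1", 1200.0), ("P001", "V2", 800.0),
    ("P002", "V3", 1000.0),
    ("P003", "V4", 500.0),
    ("P004", "V5", 700.0), ("P004", "V6", 300.0),
])


def mean_cost_per_patient(connection):
    """Return {diagnosis: mean total billed cost per patient}."""
    rows = connection.execute("""
        SELECT diagnosis, AVG(total)
        FROM (
            -- First total the bills of each patient, linked by patient_id.
            SELECT d.diagnosis AS diagnosis, SUM(b.cost) AS total
            FROM diagnosis AS d
            JOIN billing AS b ON b.patient_id = d.patient_id
            GROUP BY d.diagnosis, d.patient_id
        ) AS per_patient
        GROUP BY diagnosis
        ORDER BY diagnosis
    """).fetchall()
    return dict(rows)


costs = mean_cost_per_patient(conn)
for diagnosis, mean_cost in costs.items():
    print(f"{diagnosis}: mean cost per patient = {mean_cost:.2f}")
```

Only the two tables needed for the question are touched; the other tables (prescriptions, dosages, visits) stay out of the working set, which is the point of splitting the master datasheet.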

For instance, consider an oral hypoglycemic agent (OHA) that provides glycemic control and has a renoprotective effect. A year after its launch, sales of the OHA decline. To understand how much data we have for patients with diabetes, we first need to know how many prescriptions include the OHA and how many kidney function test results are available, broken down by age, sex, or region. Once we have adequate data, we can analyze demographic characteristics, which helps us assess heterogeneity among groups. We can then compare which groups are taking the OHA and how their glycemic control trends over time. We also try to characterize patients who switch to the OHA of interest and those who switch to other OHAs. If required, a cost analysis can be done as well. With these patterns in hand, doctors, the strategy team, and data scientists can understand the problem and take a systematic approach to improving product utilization.

Now assume that we want to develop a new diabetes drug as a fixed-dose combination of two drug classes [e.g., glucagon-like peptide-1 (GLP-1) analogues and dipeptidyl peptidase-4 (DPP-4) inhibitors]. Because both classes contain many molecules, finding the best combination requires knowing the side effects of every possible pairing of one molecule from each class. We would also want to know the benefits of taking two molecules from different classes compared with taking only one molecule, or with taking the two molecules as separate drugs.
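The pairing step can be sketched as a Cartesian product of the two classes, followed by a tally of patients and reported events for each pair. The molecule lists below are real members of each class, but the patient records and adverse events are toy values made up for illustration, and the function name `summarize_pairs` is our own. Counts of this kind describe co-occurrence in the data only; they do not establish that a combination caused an event.

```python
from collections import Counter
from itertools import product

GLP1_ANALOGUES = ["liraglutide", "exenatide", "dulaglutide"]
DPP4_INHIBITORS = ["sitagliptin", "vildagliptin", "linagliptin"]

# Toy records: (patient ID, molecules co-prescribed, reported event or None).
records = [
    ("P001", {"liraglutide", "sitagliptin"}, "nausea"),
    ("P002", {"liraglutide", "sitagliptin"}, None),
    ("P003", {"exenatide", "linagliptin"}, "nausea"),
    ("P004", {"dulaglutide"}, None),
    ("P005", {"liraglutide", "sitagliptin", "metformin"}, "hypoglycemia"),
]


def summarize_pairs(rows):
    """For every (GLP-1 analogue, DPP-4 inhibitor) pair, count the patients
    who received both molecules and tally their reported events."""
    pairs = list(product(GLP1_ANALOGUES, DPP4_INHIBITORS))
    summary = {pair: {"patients": 0, "events": Counter()} for pair in pairs}
    for _patient_id, molecules, event in rows:
        for pair in pairs:
            if set(pair) <= molecules:  # both molecules were prescribed
                summary[pair]["patients"] += 1
                if event is not None:
                    summary[pair]["events"][event] += 1
    return summary


summary = summarize_pairs(records)
for pair, stats in summary.items():
    if stats["patients"]:
        print(pair, stats["patients"], dict(stats["events"]))
```

With three molecules per class the product yields nine candidate pairs; most have no patients in the toy data, which mirrors the practical problem that many combinations are rarely co-prescribed.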

To get the analysis right, we need to understand patient behavior and how patterns differ by sex, duration of disease, severity of disease, and age group. Such analysis helps in identifying unmet needs and in targeting the drug to specific groups. The end result of the analysis is delivered as an interactive platform in which users can apply filters as needed and get immediate results.
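The filtering idea behind such a platform can be sketched with two small helpers: one that applies user-chosen filters and one that counts the remaining records by a grouping. The records, the drug labels `OHA-X` and `OHA-Y`, the age bands, and the helper names are all hypothetical; a real platform would sit on top of the linked tables rather than an in-memory list.

```python
from collections import Counter

# Toy prescription records, invented for illustration.
prescriptions = [
    {"patient_id": "P001", "sex": "F", "age": 54, "region": "North", "drug": "OHA-X"},
    {"patient_id": "P002", "sex": "M", "age": 61, "region": "North", "drug": "OHA-X"},
    {"patient_id": "P003", "sex": "F", "age": 47, "region": "South", "drug": "OHA-Y"},
    {"patient_id": "P004", "sex": "M", "age": 38, "region": "South", "drug": "OHA-X"},
    {"patient_id": "P005", "sex": "F", "age": 66, "region": "North", "drug": "OHA-X"},
    {"patient_id": "P006", "sex": "M", "age": 59, "region": "South", "drug": "OHA-Y"},
]


def age_group(age):
    """Map an age in years to an illustrative age band."""
    if age < 45:
        return "under 45"
    if age < 60:
        return "45-59"
    return "60+"


def apply_filters(rows, **criteria):
    """Keep only the rows whose fields equal every given criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def count_by(rows, key):
    """Count rows per group, where key maps a row to its group label."""
    return Counter(key(r) for r in rows)


on_x = apply_filters(prescriptions, drug="OHA-X")
by_sex = count_by(on_x, lambda r: r["sex"])
by_age = count_by(on_x, lambda r: age_group(r["age"]))
north_x = apply_filters(prescriptions, drug="OHA-X", region="North")
print(dict(by_sex), dict(by_age), len(north_x))
```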

Big data analytics does not establish causality; instead, it reveals correlations and identifies trends or patterns in complex data. It is useful for patients, healthcare professionals, pharmaceutical companies, researchers, public health professionals, and policymakers. For patients, it helps identify the risk of a particular disease, the adequacy of current therapy or the need for a switch, and it supports the development of personalized medicine. Healthcare professionals can use it as a clinical decision support system when making an initial clinical decision and revising it as required. Healthcare providers can use it to develop patient engagement platforms with interventional messages aimed at changing patient behavior, improving compliance, and enhancing patient outcomes; patients can be engaged through mobile or other internet-accessible devices. Pharmaceutical companies can use it for generating leads and repositioning existing molecules, and it can support safety surveillance for drugs and medical devices. Researchers can use it to identify knowledge gaps, public health professionals can use it to detect trends and patterns that would be missed in small datasets, and policymakers can use it for targeted resource allocation.

Big data analytics can project future trends because it learns from existing data, generating actionable insights based on correlations across time, geographic region, or a particular subset of the population.
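As a minimal sketch of learning a trend from existing data, the example below fits an ordinary least-squares line to monthly prescription counts and extrapolates one month ahead. The counts are toy values and the function names are our own; a straight line is only one simple assumption about the trend, and the projection reflects correlation with time, not a causal explanation for the change.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Toy data: prescriptions of a hypothetical drug in months 1 to 6.
months = [1, 2, 3, 4, 5, 6]
counts = [100, 96, 92, 88, 84, 80]

slope, intercept = fit_line(months, counts)
next_month = predict(slope, intercept, 7)
print(f"slope={slope:.1f} per month, projected month 7 = {next_month:.0f}")
```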

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.


1. Belle A, Thiagarajan R, Soroushmehr SM, Navidi F, Beard DA, Najarian K. Big data analytics in healthcare. Biomed Res Int 2015;2015:370194.
2. Ristevski B, Chen M. Big data analytics in medicine and healthcare. J Integr Bioinform 2018;15(3).
3. Lee CH, Yoon HJ. Medical big data: promise and challenges. Kidney Res Clin Pract 2017;36:3-11.
4. Zhang XY, Zhang P. Recent perspectives of electronic medical record systems. Exp Ther Med 2016;11:2083-5.