Have you ever owned a wearable internet-connected device or downloaded a health-related app to monitor your health status? Beyond providing real-time health information, wearable technologies generate a cloud of big personal data. This is not limited to health apps; using search engines, social media, and entertainment services such as Google, YouTube, Alibaba, Instagram, Facebook, Netflix, Amazon, and Twitter also creates vast amounts of data about “you.” In addition, advances in sensor-equipped technologies will accelerate the collection of data on how we live, how we work, and who we are. According to Klaus Schwab, the founder and executive chairman of the World Economic Forum, this is just one part of the technological change that the fourth industrial revolution will bring.
In the business world, all this big data enables better predictions of consumer needs. Big data refers to large quantities of collected information in which repeating patterns can be observed. It is an essential component of artificial intelligence, which uses machine learning algorithms, including deep learning, to digest big data. In general, more data leads to better predictions. The insurance industry is a clear example: data on health and driving habits can be collected and used to determine insurance premiums and guide policy decisions.
What about big data in the field of medicine? Big data analytics can be used to enhance the quality of healthcare, for example through the development of diagnostic subgrouping and predictive biomarkers for various therapeutic options. Big data can also be applied to the management of clinical information. Electronic medical record (EMR) systems are being implemented in the hope of yielding timely and clinically valuable data from a patient’s clinical chart without repetitive chart review [1]. This wealth of data will give clinicians the tools they need to make appropriate clinical decisions. The benefit could be even greater with cost-effective, high-capacity means of retrieving the necessary clinical data from multicenter EMRs in real time.
In the current health care system, there have been growing efforts to optimize precision medicine, also known as personalized medicine, across all medical specialties, including urology. A patient’s omics signature, which is derived from genomic, epigenomic, proteomic, transcriptomic, and/or metabolomic profiles, can independently or jointly identify different patient phenotypes. Combined with clinical and medical information, big data-based signatures can support physicians in deciding on therapeutic plans unique to individual patients.
Recently, our integrated classification of prostate cancer, using a large set of 1,321 human prostate cancer transcriptome profiles from 38 different cohorts, revealed that pathway activation signatures can serve as a new classification system for subgrouping prostate cancer into 3 different types (PCS1, PCS2, and PCS3). This subtyping is based on distinct gene expression patterns. Our study further showed that these 3 prostate cancer phenotypes had different clinical outcomes; PCS1 tumors progressed more rapidly to metastatic disease than PCS2 or PCS3 tumors [2]. Subtyping has also been applied to other cancers. Molecular subtypes of bladder cancer have been identified using comprehensive genomic data from The Cancer Genome Atlas, which is a massive coordinated gene data bank with contributions from all over the world. Specific DNA mutations and/or copy number aberrations were found to be unique to each molecular bladder cancer subtype, suggesting that personalized management of patients is possible [3]. These are just some of the many examples of how big omics data can be used to provide personalized medicine. However, to enhance the use of big data in medicine, improved accessibility to robust, well-annotated, and interactive human omics data is a critical starting point [4]. Development of deep machine learning pipelines is also a crucial step for interpreting data. Machine learning can recognize patterns in information from patient charts, medical records, and clinical trials. It can predict disease more accurately than traditional statistical regression models [5]. If successful, these tools can broaden scientific insight and arm clinicians with the knowledge of how to deliver personalized care.
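As a toy illustration of the subtyping idea described above, the sketch below assigns a new tumor profile to whichever subtype's reference signature it correlates with most strongly. This is not the method used in the cited study. The three-pathway signatures, their numeric values, and the function names `pearson` and `assign_subtype` are hypothetical and chosen only to make the example runnable; a real classifier would work with many pathways, normalized data from multiple cohorts, and proper validation.

```python
import math

# Hypothetical reference signatures: one activation score per made-up pathway.
# These numbers are illustrative only and are NOT taken from any study.
REFERENCE_SIGNATURES = {
    "PCS1": [0.9, 0.1, 0.2],
    "PCS2": [0.2, 0.8, 0.1],
    "PCS3": [0.1, 0.2, 0.9],
}


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / (sd_x * sd_y)


def assign_subtype(profile, signatures=REFERENCE_SIGNATURES):
    """Return the subtype whose reference signature best matches the profile."""
    return max(signatures, key=lambda name: pearson(profile, signatures[name]))


if __name__ == "__main__":
    new_patient_profile = [0.8, 0.2, 0.1]  # hypothetical pathway scores
    print(assign_subtype(new_patient_profile))
```

Correlation is used here rather than raw distance because it compares the shape of the expression pattern and is less sensitive to overall differences in scale between samples; other similarity measures would be equally reasonable in a sketch like this.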
The ultimate goal of big data is to access accurate and meaningful information regarding patient symptoms and family history, and to use that data to provide real-time information about patients, their clinical outcomes, and the quality of care needed. However, this technology also raises problems. Big data is still elusive: How can individual privacy be protected? How can such rich data be interpreted? How can this information be explained to patients and caregivers who do not have a scientific background? Strategies for addressing these questions depend on how the next generation of researchers and physicians is trained in analysis and interpretation. We are currently limited by a shortage of specialists in communicating big data to physicians and patients. Many medical schools have begun establishing extensive educational curricula to foster specialists for this fourth industrial revolution era [6]. Big data has meaningful potential uses in the future, and the possibilities are exciting, but getting there is the challenge.
So…are we ready for the next big shift?