The Big Data Revolution

Physicians and hospitals have more information than ever about patients. That data has the potential to improve care and save lives, but it also has its pitfalls.

By Kimberly Turner

Big data has the potential to vastly change the field of cardiovascular health care in a variety of ways, from helping to identify risk factors and streamlining trials to improving patient care and monitoring how well people respond to therapies. But what exactly do we mean by the term big data? The most basic answer is exactly what it sounds like: Big data is a set of information that is too large to be handled by normal data processing programs.

Thanks to electronic health records (EHRs), databases, registries and other sources, we are gathering information about patients' health at an unprecedented rate. The typical hospital generates about 100 terabytes of data every year, according to James Tcheng, M.D., interventional heart specialist at Duke University. If that doesn't mean much to you, consider this: The entire Library of Congress contains only 10 terabytes of text data.

All of the information in the world is useless, though, unless it can be interpreted and analyzed in meaningful ways. That's where big data comes in. Experts are working to create advanced statistical and mathematical methods to draw conclusions that can be used by doctors and hospitals to guide medical decisions.

10 HEARTBEAT WINTER 2015
You're Already Benefiting from Big Data

The process sounds complicated (and, truthfully, it is), but the results can be simple. Most people already benefit from big data every day. When you use your grocery store loyalty card and receive coupons for products you might enjoy, those suggestions come from the analysis of millions of shoppers' buying habits, aka big data. When you visit Netflix, iTunes or Amazon and receive recommendations about what to buy based on your previous purchases and ratings, that's big data in action.

If we can do those sorts of fancy algorithmic predictions when it comes to movie rentals, why aren't we getting daily, individualized medical advice from our computers and smart phones yet? In short, because no one gets seriously injured when Netflix recommends the wrong movie, but applying big data conclusions to individual patients can be ineffective at best and deadly at worst.

Proceeding with Caution

John Rumsfeld, M.D., Ph.D., is chief science officer at the American College of Cardiology's National Cardiovascular Data Registries (NCDR), the preeminent cardiovascular data repository in the U.S. He explains, "Big data needs to be thought of like a new technology, new medical device, or new drug. It needs the same sort of rigorous evaluation to know that when we use it, it will actually improve care and outcomes and not have unintended consequences. For example, if the models built off of big data platforms don't correctly identify patients who need a certain therapy or misclassify people as being at high or low risk, you could imagine there being unintended harm. Hospitals and doctors are being told that big data will help guide their decisions, but that guiding will only be as good as the underlying models.

"We should have enthusiasm because we're getting a new tool to help us potentially understand patterns of care and predict which patients might be at the highest risk. All of that is tremendously exciting. I just think as it gets down to making decisions about individual patients, we should subject that to rigorous evaluation. I'm worried that the enthusiasm may be outweighing the evidence, so it's important for professional societies, such as the American College of Cardiology, and academic research organizations to partner and work with big data and bring a scientific rigor to it, which NCDR is committed to doing." The stakes in health care are higher.

[Caption: Dr. Matthew Oster says that by using existing data, researchers and physicians can streamline medical trials so that they're faster and more cost-efficient than prospective, randomized, multi-center trials.]

[Caption: Dr. John Rumsfeld cautions that health care providers must carefully evaluate the data so that it improves patient care, not hinders it.]
Beating World's Biggest Killer with World's Best Technology

By Kimberly Turner

A new study by the University of California, San Francisco, in collaboration with the American Heart Association, is taking big data to the next level by combining clinical data and electronic health records with crowd-sourced information from surveys, participants' smart phones, wearable fitness devices, and high-tech gadgets. The result is Health eHeart, the largest and most ambitious study on heart disease ever conducted.

With about 30,000 participants currently signed up (compared with 5,209 in the first generation of the Framingham Heart Study), Health eHeart may eventually gather data from as many as a million people. Those participants, who vow to take part for at least 10 years, will have the ability to link their medical records to the system as well as provide real-time metrics via their computer or mobile device. The thought behind the study is that to beat the world's worst killer, you need the world's best technology and that every bit of data brings us closer to saving lives, according to the study's website.

Jeffrey Olgin, M.D., chief of cardiology at UCSF and a principal investigator in the Health eHeart study, explains more about the decision to use a crowd-sourced approach: "Clinical research has become so expensive and the time it takes to get something done is so long, so we really wanted to develop and test a new paradigm for doing clinical research that was faster, bigger and more nimble. The second piece is that in order to make discoveries that are more than just sort of the low-hanging fruit, we really need big numbers, and it's really expensive to do a large-scale clinical research study with traditional methods."

[Caption: Dr. Jeffrey Olgin leads the Health eHeart Study, which combines clinical data and electronic health records with crowd-sourced information from surveys, participants' smart phones, wearable fitness devices and high-tech gadgets.]
He and his collaborators also believe that getting data from the patient on a more regular basis has the potential to tell a much richer story. Anyone, even those with no history of heart problems, can register to take part in this study at health-eheartstudy.org.

Focusing on Populations, Not Individual People

While it may be a few years before heart patients can wear high-tech devices that provide accurate individualized alerts when they are at high risk, big data is already benefiting cardiovascular patients on a larger, population-level scale.

Research is one realm where big data is already making a difference. Matthew Oster, M.D., M.P.H., Director of Children's CORPS (Cardiac Outcomes Research Program at Sibley Heart Center) at Children's Healthcare of Atlanta, says that one of big data's largest advantages is that "it can help streamline the ability to do larger trials without having to do a prospective, randomized, multi-center trial since that can be very time- and cost-prohibitive. By using existing data, we can use advanced methods to try to answer some of the burning questions in a much quicker and cost-effective manner."

Databases and registries are being used to speed up the process of identifying qualified subjects and, in doing so, can also reduce the cost of clinical trials and the time it takes for new drugs and treatments to be brought to market.

Thanks to enormous, crowd-sourced endeavors such as the University of California, San Francisco's Health eHeart study (see sidebar), certain non-drug interventions can be tested almost in real time. Jeffrey Olgin, M.D., chief of cardiology at UCSF and a principal investigator in the Health eHeart study, describes a recent example: "A collaborator approached us and said, 'We have this way to help reduce smoking and we want to test it. It could be really impactful and scalable,' so we said, 'Sure. We have several thousand daily smokers in our study and we can reach them instantly to deploy the intervention and tell whether it worked.' We went from the idea of that intervention to starting the study in a three to four week period."
Using Big Data for CHD Care

Jeffrey Jacobs, M.D., professor of surgery at Johns Hopkins University and chair of The Society of Thoracic Surgeons Congenital Heart Surgery Database, says benchmarking outcomes against national aggregate data is, in his opinion, the most useful role of big data for pediatric cardiac surgery: "That benchmarking allows opportunities to identify areas that would benefit from quality improvement initiatives and allows the across-the-board identification of programs that are performing better than expected and worse than expected. Quality improvement initiatives can be established in programs that are performing worse than expected, and we can learn from the programs that are performing better than expected to improve quality across the board."

The Congenital Cardiac Interventional Study Consortium (CCISC) registry, designed to determine best practices for the treatment of pediatric patients with congenital heart disease (CHD), is an example of those sorts of quality control efforts in practice. Co-founded by Thomas Forbes, M.D., pediatric cardiologist at the Children's Hospital of Michigan, CCISC's databases currently include information from around 30 institutions in the United States, Europe and South America. That number is expected to double in the next six to 12 months.

By comparing the results from institutions around the world, CCISC's use of big data has already had a noteworthy impact on the complication rates of pediatric congenital heart defect patients. Dr. Forbes says that using the data to identify the need for a smaller catheter for babies weighing less than 10 kg has vastly decreased the number of injuries and complications and that comparing surgeries performed with and without body warmers has dramatically changed the complication rates of infants and babies undergoing procedures in South America. Those are two examples, he says, of how the database "showed us who was at high risk and what we can do to make that better."

Big Data at a Glance

The typical hospital generates about 100 terabytes of data every year. By comparison, the entire Library of Congress contains only 10 terabytes of text data.

Big data needs rigorous evaluation so that when it is used, it will actually improve outcomes and not have unintended consequences.

Databases and registries can help speed up the process of identifying qualified subjects and, in doing so, can reduce the cost of clinical trials and the time it takes for new drugs and treatments to be brought to market.

Not only is much of the available data in inconsistent formats and disparate systems, even the terminology and measurements used are variable.

The consistency, quality and availability of information continue to improve. By 2018, the government will require electronic health records to fulfill meaningful use requirements that will make the advantages of big data more attainable.

Consistency Is Key

The registry was also able to significantly reduce the amount of radiation some patients are exposed to over their lifetimes, but before that conclusion could be reached, CCISC had to overcome one of the biggest pitfalls in medical big data analysis: lack of consistency. Not only is much of the available data in inconsistent formats and disparate systems, even the terminology and measurements used are variable.

Dave Fornell, editor of Diagnostic and Interventional Cardiology, says, "One hospital may call the imaging system in its cath lab a fluoroscopy system, another may call it angio or angiography, vascular imaging system, digital angiography, flat panel detector; they all mean the same thing. What one person might say is one part of the anatomy may be called something else by another doctor. They're going to have to come up with standard taxonomy in order for the datapoints to mean the same thing across the country and across hospitals."

That's just what CCISC had to do before it could accurately be used to compare radiation levels among patients around the world. Dr. Forbes explains, "We said, 'First of all, we all have to measure radiation in the same way.' So we got five nuclear physicists and said, 'We can only choose one parameter. What is it going to be?' They researched it and came up with one measurement. We went to all the vendors that provide the cath lab systems and said, 'Your system has to measure this. That's the way it's going to be.' They changed their systems so that they all measured radiation the same way. Once they did that, we were able to compare institutions as far as overall radiation dosage and break it down for different procedures."

The analysis showed that some institutions were using much more radiation than others, and thanks to that knowledge, Dr. Forbes says, "We've seen a dramatic drop, both in complication rates as well as radiation dose by 50 percent. That's a huge amount over the lifetime of a child who might have 13, 15, or 20 procedures. That's the advantage of this database."

In 2014, President Obama signed the Gabriella Miller Kids First Research Act. The bill ends taxpayer financing of political party conventions and redirects that money to a pediatric research initiative through the National Institutes of Health. The Kids First Pediatric Research program includes a vast database that allows researchers to better understand birth defects, such as congenital heart defects, and pediatric cancers.

Data Quality

The quality of data is also a major concern to big data experts. Fornell says, "You have the old saying: garbage in/garbage out, so while we have all these advanced tools to manipulate the data and analyze it and try to interpret it, if we don't have good quality data to begin with, everything else is going to be garbage. We need to make sure we're measuring what we say we're measuring, classifying appropriately, and truly getting accurate and reliable information before we do anything with it."

The consistency, quality and availability of information continue to improve, and by 2018, government reform policies will require EHRs to fulfill meaningful use requirements that will make the advantages of big data more attainable. With careful use of this ever-increasing river of information, big data has the potential to identify new risk factors for heart disease, decipher the links between family history and cardiovascular events, find correlations between environmental factors and heart disease, increase quality of care around the world, assist physicians with making the best possible choices, and monitor the effectiveness of various treatments and procedures over time.
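For readers curious what a "standard taxonomy" might look like in practice, here is a minimal sketch in Python of the idea Fornell describes: different hospitals' labels for the same cath lab imaging system are mapped to one shared term, and anything unrecognized is set aside rather than analyzed. The equipment labels come from his example; the canonical name "angiography system", the record layout, and the function names are assumptions made for illustration only and are not drawn from CCISC or any real registry.

```python
# Hypothetical synonym table: the labels come from Fornell's example, but the
# canonical name "angiography system" is an assumption made for illustration.
CANONICAL_TERMS = {
    "fluoroscopy system": "angiography system",
    "angio": "angiography system",
    "angiography": "angiography system",
    "vascular imaging system": "angiography system",
    "digital angiography": "angiography system",
    "flat panel detector": "angiography system",
}


def normalize_term(raw_label):
    """Return the canonical term for a label, or None if it is unrecognized."""
    key = " ".join(raw_label.lower().split())  # ignore case and extra spaces
    return CANONICAL_TERMS.get(key)


def normalize_records(records):
    """Split records into (clean, rejected) based on the 'equipment' field."""
    clean, rejected = [], []
    for record in records:
        term = normalize_term(record.get("equipment", ""))
        if term is None:
            rejected.append(record)  # unrecognized label: hold back for review
        else:
            clean.append({**record, "equipment": term})
    return clean, rejected
```

Holding back unrecognized labels, rather than guessing at them, reflects the "garbage in/garbage out" concern: a record only enters a comparison once its terminology is known to mean the same thing at every site.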