Are doctors created equal? An investigation of online ratings by patients

Guodong (Gordon) Gao, University of Maryland
Jeffrey S. McCullough, University of Minnesota
Ritu Agarwal, University of Maryland
Ashish K. Jha, Harvard University

Introduction

There is broad consensus among policy makers and consumer groups that greater transparency in healthcare will improve the quality and cost of care delivered. In recent years, several government-led transparency initiatives have focused on healthcare institutions such as hospitals (Jha et al. 2005, Harris and Buntin 2008). Yet patients still have few means of discerning the quality of individual physicians, even though there is evidence that the variation in quality across individual physicians can be substantial (Gawande 2002). While national efforts at physician performance reporting have seen slow progress, a new phenomenon has begun to fill the gap. Fueled by the widespread adoption of Web 2.0 technologies, online physician ratings have gained momentum in recent years. Take RateMDs.com, one of the most popular doctor rating websites, as an example: it has grown from about 3,000 ratings in 2005 to over 360,000 by January 31, 2010. Traditional review websites are joining the bandwagon as well. Angie's List found that 76% of its 600,000 subscribers wanted more information about physicians, so it added the ability for members to rate physicians in April 2008. Even Zagat, famous for its restaurant reviews, began rolling out physician ratings in 2009. Additionally, there is evidence that physician ratings are being used by patients. A recent Pew survey finds that 61% of US adults have looked online for health information, and among them 24% have consulted online rankings or reviews of physicians or other providers (PEW 2009). These ratings have generated worries among doctors and healthcare organizations.
Advocates argue that such rating systems will provide consumers with much-needed information about physician quality (at least from the consumer experience perspective). Critics worry that internet rating sites will become a forum for disgruntled patients to vent about minor shortcomings, and that a small number of such ratings might tarnish physicians' reputations (McCartney 2009, Miller 2007). Professional societies such as the American Medical Association and even some state governments have expressed concerns about these rating programs (Dolan 2008, Martin 2009). The controversy is not limited to the United States. Since October 2009, the National Health Service (NHS) in the United Kingdom has encouraged patients to rate their general practitioners through an NHS-run website, which has also sparked considerable debate. Despite these controversies and the tremendous potential for these ratings to affect physician livelihood and patient behavior, there is limited understanding of the nature of the ratings. In this study we aim to provide insights into two fundamental questions related to doctor ratings:
(1) What factors influence a doctor's likelihood of receiving ratings from patients? and (2) Do the ratings reflect the true quality of doctors? Answers to these two questions not only have important policy implications, but also contribute to the research on online word-of-mouth (WOM). The widespread adoption of Web 2.0 technologies in recent years has enabled a rapid increase in online word-of-mouth, and consumer-generated ratings of products and services are now commonplace. As a result, the past decade has witnessed an increasing number of studies on online reviews by consumers (e.g., Godes and Mayzlin 2003; Dellarocas, Awad and Zhang 2004; Clemons, Gao and Hitt 2006; Duan, Gu and Whinston; Li and Hitt 2008; Dellarocas, Gao and Narayan 2010; Zhu and Zhang 2010). These studies have provided important insights into the relationship between online consumer reviews and sales. On the other hand, we have surprisingly little knowledge about the nature of the reviews themselves (Dellarocas, Gao and Narayan 2010). For example, why do some products receive more reviews than others? Do these reviews reflect the product's quality, which is often an assumption in existing studies rather than a proven fact? The two questions we explore in this study shed light on these important issues. Additionally, we expand the realm of online WOM studies from products to healthcare services, the largest sector of the US economy.

Data

We develop a novel data set incorporating both physician characteristics and patients' online ratings. Our rating data were captured from RateMDs.com in February 2010. RateMDs.com is one of the nation's largest aggregators of consumer physician ratings. Our physician data are drawn from Virginia licensing board information. We chose Virginia because it is a relatively large state that provides relatively detailed data on licensed physicians. We matched the ratings database to the Virginia data on the basis of name, address, and specialty.
Our sample includes a near census of Virginia's licensed physicians: 18,174 actively practicing physicians as of January 31, 2010. Our dataset contains various measures of physician characteristics from the Virginia Board of Medicine. These measures include board certification, specialty, experience, malpractice claims, and educational background. Board certification includes those recognized by the American Board of Medical Specialties, the Bureau of Osteopathic Specialists and Boards of Certification, the American Board of Multiple Specialties in Podiatry, or the Council on Podiatric Medical Education. A physician's specialty is taken from the board certification or, if not certified, from self-designated practice areas. As there are over 100 specialties, we group them into five major categories: family/pediatrician, obstetrics and gynecology (OB/GYN), surgery, hospital-based, and other specialties. We also collect each physician's year of medical school graduation as a proxy for both age and practice duration. Graduation year is divided into four categories: before 1980, 1980-1989, 1990-1999, and 2000-2009. We also obtained the medical school research and primary care rankings from US News and World Report (we report findings based on the 2008 ranking). We use a dummy variable to indicate whether the medical school is ranked in the top 50 in primary care. Finally, we included an indicator equal to 1 if a physician has experienced any malpractice claims and equal to 0 otherwise.

RateMDs.com's physician ratings have four categories: staff, punctuality, helpfulness, and knowledge. RateMDs.com generates an overall physician quality measure based on helpfulness and knowledge. We focus on the rated overall quality in the following analyses. We did, however, examine all four individual rating measures and found that they were highly correlated.

Empirical models

We use the following specification to estimate the factors that influence the likelihood of a doctor being rated online:

logit(Pr(BeingRated = 1)) = β0 + β1·BoardCertified + β2·SchoolRanking + β3·Malpractice + Σi γi·Specialtyi + Σj δj·GraduationYearj    (1)

In the above regression, three variables, BoardCertified, SchoolRanking, and Malpractice, reflect the quality of the physician. We include groups of dummy variables to measure a doctor's other characteristics. For Specialty, primary care physicians are the default group, and the model contains four dummies for the following groups: medical specialties, surgeon/surgical specialties, OB/GYN, and other specialties. Similarly, for GraduationYear, we make "before 1980" the default group and include three dummies for 1980-1989, 1990-1999, and 2000-2009, respectively. To examine what factors influence the value of ratings, we adopt the following model:

RatedOverallQuality = β0 + β1·BoardCertified + β2·SchoolRanking + β3·Malpractice + Σi γi·Specialtyi + Σj δj·GraduationYearj + ε    (2)

To estimate the above equation, we naturally limit our sample to doctors with at least one rating online.

Findings

Determinants of the likelihood of being rated online

Of the 18,174 physicians in Virginia, 3,164, or about 1 out of every 6, had received at least one rating. These Virginia physicians had a total of 10,534 ratings by January 31, 2010, an average of 3.3 ratings per rated physician.
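A specification of the form of Equation (1) can be sketched as follows. This is a minimal, hedged illustration using statsmodels, not the authors' actual code: the variable names are invented, and the data are simulated because the RateMDs.com/Virginia Board of Medicine dataset is not public.

```python
# Illustrative sketch of Equation (1): a logistic regression of whether a
# physician is rated online on quality proxies plus specialty and graduation
# cohort dummies. All data below are simulated; coefficients are arbitrary.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "board_certified": rng.integers(0, 2, n),
    "top50_school": rng.integers(0, 2, n),
    "malpractice": rng.binomial(1, 0.07, n),
    "specialty": rng.choice(
        ["primary_care", "medical", "surgical", "obgyn", "other"], n),
    "grad_cohort": rng.choice(["pre1980", "1980s", "1990s", "2000s"], n),
})
# Simulate the outcome from an arbitrary "true" model, for illustration only.
logit_p = (-1.5 + 0.5 * df["board_certified"]
           + 0.9 * (df["specialty"] == "obgyn"))
df["rated"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# C(..., Treatment(...)) sets the reference categories to match the paper:
# primary care physicians and pre-1980 graduates.
model = smf.logit(
    "rated ~ board_certified + top50_school + malpractice"
    " + C(specialty, Treatment(reference='primary_care'))"
    " + C(grad_cohort, Treatment(reference='pre1980'))",
    data=df).fit(disp=0)
odds_ratios = np.exp(model.params)  # exponentiate to odds ratios, as in Table 1
print(odds_ratios.round(2))
```

Exponentiating the logit coefficients yields the odds ratios reported in Table 1, with the reference group of each dummy block fixed at 1.00.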
The unadjusted rates already reveal substantial variation across specialty, graduation year, and board certification. Among the specialties, 37% of OB/GYNs are already rated online, while only 7% of physicians in other specialties have ratings online. Across graduation years, 20% of physicians who graduated between 1980 and 1999 have received at least one rating; this number drops to only 9% for the youngest cohort of doctors. Interestingly, board-certified physicians are much more likely to be rated than those without certification (19% vs. 10%).[2]

[2] We also examined whether there were claims after 2005 and whether the monetary amount of claims was equal to or above the average in that specialty. These details are available from the authors upon request.

On the
other hand, graduates of more highly ranked medical schools were rated with nearly the same frequency as graduates of lower-ranked medical schools. Malpractice history does not appear to significantly affect the likelihood of being rated either. All of these findings are further supported by the logistic regression.

Table 1. Virginia physicians by quality rating status.

                                   No. of      Unadjusted rate      Odds ratio         P-value
                                   physicians  of being rated (%)   (95% CI)
All physicians                     18,174      17
Specialty
  Primary care                     6,540       19                   1.00 (reference)   -
  Medical specialties              2,806       20                   0.94 (0.84-1.06)   0.34
  Surgeon/surgical specialties     2,751       22                   1.07 (0.96-1.20)   0.23
  OB/GYN                           1,145       37                   2.40 (2.09-2.76)   <0.001
  Other specialties                4,932       7                    0.28 (0.25-0.32)   <0.001
Graduation year
  Before 1980                      5,142       17                   1.00 (reference)   -
  1980-1989                        5,276       20                   1.22 (1.10-1.35)   <0.001
  1990-1999                        5,184       20                   1.16 (1.04-1.28)   0.006
  2000-2009                        2,572       9                    0.53 (0.45-0.63)   <0.001
Board certification
  Board certified                  15,057      19                   1.00 (reference)   -
  Not board certified              3,117       10                   1.62 (1.42-1.85)   <0.001
Medical school ranking
  Ranked top 50*                   4,962       18                   1.00 (reference)   -
  Ranked below top 50              13,212      17                   0.96 (0.88-1.05)   0.42
Malpractice claims
  No malpractice claims            16,886      17                   1.00 (reference)   -
  At least one malpractice claim   1,288       23                   1.12 (0.97-1.29)   0.116

* Based on the 2008 US News and World Report ranking.
Determinants of the value of ratings

First, it is interesting to note that the average quality rating, which is based on the physician's helpfulness and knowledge, was high (3.93 out of 5). As shown in Figure 1, 46% of physicians received a 5 out of 5, while only 12% of ratings were below 2. Although some physicians are concerned that online ratings will become a channel for disgruntled patients to vent their complaints, our findings suggest that this is not the primary driver for most patients who use the rating system. In fact, given that nearly half the ratings are a perfect 5 out of 5, online ratings appear, at least so far, to be driven by patients who are delighted with their physicians.

Figure 1. Distribution of overall quality ratings across physicians. [Bar chart; x-axis: rated overall quality (1.0-1.9, 2.0-2.9, 3.0-3.9, 4.0-4.9, 5); y-axis: percentage of physicians.]

Next, we examine the factors that affect the value of ratings. We report the OLS estimates for Equation (2) in Table 2. Because the ratings are bounded between 1 and 5, we also apply a Tobit model and obtain virtually the same findings. We found modest effects of specialty on the quality ratings physicians received. Virginia physicians classified as other specialties had moderately lower ratings (3.63) than the other physician specialty categories (Table 3). The differences among the other specialists, such as primary care physicians (4.04), medical specialists (3.95), surgeons (3.90), and obstetrician/gynecologists (4.04), were small and not significant. Younger physicians, those who had graduated from medical school after 2000, had significantly higher ratings than older physicians. While there were small differences across the older cohorts (3.85 for physicians graduating before 1980, 3.95 for those graduating in the 1980s, and 3.99 for those graduating in the 1990s), the youngest cohort had an average rating of 4.22 (p-value for differences across groups <0.001).
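The OLS-versus-Tobit robustness check described above can be sketched as follows. This is a hedged illustration under simulated data: it uses a single made-up regressor (board certification), an invented censoring process, and a hand-rolled two-limit Tobit likelihood, since the paper's data and code are not available.

```python
# Illustrative sketch of Equation (2) estimated two ways: plain OLS, and a
# two-limit Tobit that treats the 1-5 rating bounds as censoring points.
# All data are simulated; "true" parameters are arbitrary.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 3000
# Design matrix: intercept + one illustrative dummy (board certification).
X = np.column_stack([np.ones(n), rng.integers(0, 2, n)])
beta_true = np.array([3.8, 0.2])
y_star = X @ beta_true + rng.normal(0, 0.8, n)  # latent (unbounded) quality
y = np.clip(y_star, 1.0, 5.0)                   # observed rating, censored at 1 and 5

# OLS on the censored outcome (slightly attenuated by the censoring).
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

def neg_loglik(params):
    """Negative log-likelihood of a two-limit Tobit with limits 1 and 5."""
    b, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # parameterize sigma on the log scale
    xb = X @ b
    ll = np.where(
        y <= 1.0, norm.logcdf((1.0 - xb) / sigma),        # censored below
        np.where(y >= 5.0, norm.logcdf((xb - 5.0) / sigma),  # censored above
                 norm.logpdf((y - xb) / sigma) - np.log(sigma)))  # interior
    return -ll.sum()

res = minimize(neg_loglik, x0=np.array([3.0, 0.0, 0.0]), method="BFGS")
beta_tobit = res.x[:2]
print("OLS:  ", beta_ols.round(2))
print("Tobit:", beta_tobit.round(2))
```

With modest censoring at the top of the scale, the two estimators give very similar slope estimates, which is consistent with the paper's report that OLS and Tobit yield virtually the same findings.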
Board certified physicians had somewhat higher ratings than physicians who were not board certified (3.96 versus 3.86, p=0.04). Similarly, physicians graduating from a top-50 medical school had somewhat higher ratings than other physicians (4.08 versus 3.91, p=0.002). Physicians with no history of paying malpractice claims were rated somewhat higher than physicians who had at least one malpractice claim, although this difference did not reach statistical significance (p=0.099).
Table 2. OLS regression of the value of quality ratings on physician characteristics.

                                                Coefficient (95% CI)     P-value
Specialty*
  Medical specialties                           -0.07 (-0.19 to 0.06)    0.279
  Surgeon/surgical specialties                  -0.11 (-0.24 to 0.01)    0.068
  OB/GYN                                         0.01 (-0.13 to 0.15)    0.875
  Other specialties                             -0.39 (-0.54 to -0.23)   <0.001
Graduation year**
  1980-1989                                      0.09 (-0.02 to 0.21)    0.100
  1990-1999                                      0.12 (0.00 to 0.23)     0.047
  2000-2009                                      0.41 (0.22 to 0.61)     <0.001
Board certified                                  0.16 (0.01 to 0.31)     0.037
Medical school ranked top 50 in primary care     0.16 (0.06 to 0.26)     0.002
At least one malpractice claim                  -0.13 (-0.28 to 0.02)    0.099

* Primary care physicians are the comparison group.
** Before 1980 is the comparison group.

Discussion and future research

We examined physician ratings on the largest user-submitted physician review website in the U.S. We found dramatic growth in the number of physicians being rated (now 1 in 6 practicing doctors in Virginia). Rating penetration differed by physician specialty: obstetrician/gynecologists were far more likely to be rated than others, with more than 1 in 3 such physicians now having an online rating, while physicians with less direct patient contact (such as pathologists or radiologists) were infrequently rated. Not surprisingly, young physicians are less likely to be rated, possibly due to a smaller patient base. As to the value of ratings, we found that physicians who were board certified and who came from top-ranked medical schools had generally higher ratings than other physicians, suggesting that the ratings, to some extent, reflect the quality of the doctor. The fact that physicians with a history of paid malpractice claims had slightly lower ratings further supports this explanation. This study begins to unveil the complex nature of consumers' rating-generation process, including what they rate and how they rate. We are actively extending the current findings in several important ways and hope to report these findings at WISE.
(References omitted due to the page limit)