RAND: patient preferences should inform hospital rankings

RAND Corporation researchers have effectively endorsed key elements of the Value Ratings method for hospital rankings.

The article appears in the New England Journal of Medicine. The authors call for tailoring summary performance ratings by incorporating the preferences and needs of individual users.

Followers of this blog will recognize this counsel as being in near perfect accord with Value Ratings.

Hospital rankings: how RAND aligns with Value Ratings

The similarity between what the RAND researchers write and our Value Ratings methodology is remarkable and significant. Some article excerpts:

  • A report tailored to the “average patient” will probably be a poor fit for most.
  • [As] currently constructed, the weighting systems that underlie overall hospital performance ratings are expressions of the values, preferences, and tastes of their creators.
  • One-size-fits-all weighting…can be replaced with user-determined weights in the Internet age.
  • By allowing such personalization, creators of performance reports can enhance the value of their overall ratings and rankings to the consumers who might use them.

Next, the RAND researchers demonstrate how hospital rankings would change for two fictitious people, Patient A and Patient B. This closely parallels our fictional Alice and Bob, who rate medical groups differently using real performance data produced by the Washington Health Alliance.

The researchers imagine a scheme where a user assigns weights to performance domains in isolation from each other. We have noted this procedure as being prone to introspection risk. The authors also appear to recognize this pitfall. They suggest suitable weights might someday be “based on the results of a questionnaire.” Here, we see a clear parallel to the survey that reveals your Preference Profile. It detects and quantifies both the priority and relative importance of your preferences across multiple areas of performance. Most importantly, this includes untangling the interplay between areas.

Better options for people seeking healthcare

RAND’s counsel could be good news for healthcare consumers. Soon, the decision aids made available to patients might acknowledge their situational needs and personal values.

How Amazon could change health care

Wondering how Amazon could change health care? A post by Mary Jo Condon, MPPA, disputes a Harvard Business Review article that calls for common, agreed-upon definitions of “value.”

She asks: Why shouldn’t people define value on their own terms? After all, our circumstances and needs can change, so people ought to be empowered to specify their own value “recipe.”

Value Ratings make her solution possible.

Excerpts from her excellent post:

The Harvard Business Review recently published an article titled, “We Won’t Get Value-Based Health Care Until We Agree on What ‘Value’ Means.”

The title begs a simple question, “Why do we have to agree?”

Why can’t patients and the employers and health plans that foot much of their health care bills make individual and situational determinations of value?

In every other product and service, consumers choose the relative mix of cost, quality and features that define value for them – for that product or service and on that day.

Think about shopping for hotels. You pop online and adjust sliders and filters to the right mix of price, amenities and star ratings for that trip. For me, sometimes I’m more price conscious. Other times, I prioritize location. Every once in a while, I’m just feeling fancy. Data and technology allow me to make informed, personal and situational choices in a matter of seconds.

If you’ve ever wondered how Amazon could change health care, this may be your answer.

You can read her entire post here.

Using real data, we have shown with Value Ratings that different people would favor different medical groups based on their differing personal preferences.

Value Ratings enable people and organizations to acknowledge and assert their unique situational determinations of value. For health care, this might include quality, appropriateness, patient experience, cost, effectiveness, availability, and so on.

Why would we compel different people to use the same, one-size-fits-all “value recipe” before selecting medical groups? Or most anything else, for that matter…

Value Ratings convert popular school rankings into useful ones

The Financial Times has published their school rankings for graduate business programs. Coming out on top was Stanford University.

So, as a prospective student, should you target Stanford for your business degree? There are suggestions the answer is yes, but a careful reading shows no such direct advice.

Rather, a moment’s common sense leads to the proper reaction: “It’s hard to say if Stanford is really best for me. It depends on whether the Financial Times has incorporated my values into its ranking system.”

School Rankings at FT.com

Let’s examine how FT.com distills performance across 20 categories to arrive at its final rankings. We quickly discover, unfortunately, the dubious but extremely popular one-size-fits-all weighting scheme. It implies that all aspiring business school students have identical preferences for each of the 20 performance categories.

Below is a detailed look at the points-based weighting behind the FT.com final rankings:

The above point assignments are fixed. They apply to you, me, and anyone else who might turn to FT.com for guidance on leading programs. So, you are correct to ask: What do these fixed weights imply about my preferences? And are those implications accurate?

For example, below are three observations lifted from the points table, and their implications:

Implication #1: You value the volume of published faculty research 5 times greater than whether alumni recommend the program. (Is this true for you?)

Implication #2: For you, value-for-the-money is only half as important as international mobility. (Does this describe you?)

Implication #3: You believe graduates’ starting salary is about 7 times more important than whether graduates say they achieved their MBA-related goals. (Agree?)

It is easy and instructive to compare the points for any two performance categories, and ask yourself: “Does this describe me?” Granted, this exact points scheme must capture someone’s preferences perfectly. However, for most everyone else, a fixed weighting scheme is blunt and possibly misleading.
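The comparison described here is simple division. Below is a minimal sketch in Python. The category names and point values are hypothetical placeholders, not the actual FT.com points, and `implied_ratio` is a name invented for this illustration.

```python
# Hypothetical point weights for illustration only; NOT the actual FT.com values.
points = {
    "category_a": 10.0,
    "category_b": 2.0,
    "category_c": 4.0,
}


def implied_ratio(points, first, second):
    """How many times more important `first` is than `second` under fixed points."""
    return points[first] / points[second]


for first, second in [("category_a", "category_b"), ("category_c", "category_b")]:
    ratio = implied_ratio(points, first, second)
    print(f"{first} counts {ratio:.1f}x as much as {second}. Does this describe you?")
```

Running the same comparison over every pair of categories in a published points table would list all the preferences the ranking silently assumes on your behalf.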

What Gives? Is there a Better Way?

So, who uses one-size-fits-all school rankings? The principal audience is most likely the schools themselves. We need only turn to the Financial Times itself for confirmation. Its own podcast service, Alphaville, recently interviewed author Will Davies on the topic of expertise, and during that conversation the Financial Times interviewer made a telling observation:

The example that always springs to my mind with this kind of thing is university rankings. Because it seems to capture what’s going on with a lot of expertise — which is that the mechanism of university rankings is still a very powerful force, even if no one at any university believes in them, which they don’t. Because they’re arbitrarily constructed. There’s all these kind of imperfect weightings. You have a conversation with somebody and they’ll say, “These university rankings are nonsense.” And a day later they’ll say, “We’re absolutely delighted — we’re third in the league table.”

There is a better way. Implement Value Ratings. Instead of catering to the producers, Value Ratings give consumers a tailored, personalized ranking of performance that acknowledges their unique Preference Profile.

Amazon HQ2 and Value Ratings

Amazon’s recent announcement that it will seek a second company headquarters is an ideal opportunity to apply the Value Ratings method. Using publicly available data, Value Ratings show why accounting for preferences is critical when ranking complex choices.

In addition to preferences, methods matter, too. The New York Times’ approach, which concluded Denver is the ideal locale, illustrates a pitfall of “sequential filtering.” With each added restriction, the pool of remaining options shrinks, making choice less taxing. The price, however, is that we lose sight of excluded options for the rest of the process.

In contrast, the Value Ratings method never discards an option because of weak performance in a single area. After all, weak performance in an area of little importance to you could still result in a strong Value Rating.
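The contrast between the two methods can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the city names, scores, thresholds, and weights are hypothetical, and a plain weighted average stands in for the Value Ratings calculation, which this post does not spell out.

```python
# Hypothetical options scored 0..1 on three attributes (illustrative values only).
options = {
    "City X": {"talent": 0.9, "cost": 0.3, "transit": 0.8},
    "City Y": {"talent": 0.6, "cost": 0.7, "transit": 0.6},
    "City Z": {"talent": 0.5, "cost": 0.9, "transit": 0.4},
}


def sequential_filter(options, thresholds):
    """Apply one restriction at a time; anything dropped never returns."""
    remaining = dict(options)
    for attribute, minimum in thresholds:
        remaining = {
            name: scores
            for name, scores in remaining.items()
            if scores[attribute] >= minimum
        }
    return sorted(remaining)


def weighted_scores(options, weights):
    """Score every option; a weak area lowers the score but excludes nothing."""
    total = sum(weights.values())
    return {
        name: sum(weights[a] * scores[a] for a in weights) / total
        for name, scores in options.items()
    }


thresholds = [("cost", 0.5), ("talent", 0.55)]           # hypothetical cut-offs
weights = {"talent": 0.7, "cost": 0.1, "transit": 0.2}   # hypothetical preferences

survivors = sequential_filter(options, thresholds)
scores = weighted_scores(options, weights)
best = max(scores, key=scores.get)

print("Survives filtering:", survivors)
print("Best weighted score:", best)
```

In this toy data, City X is eliminated at the first cost cut-off, yet it earns the highest weighted score for a decision-maker who cares mostly about talent.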

Amazon identified four specific attributes it valued in a metropolitan location for its HQ2:

  1. Metropolitan areas with more than one million people
  2. A stable and business-friendly environment
  3. Urban or suburban locations with the potential to attract and retain strong technical talent
  4. Communities that think big and creatively when considering locations and real estate options

Of course, we cannot precisely know Amazon’s preferences across these four attributes. Instead, we will imagine two possible “corporate personas” to illustrate the critical role played by preferences.

Persona #1: Satisfy the Investors

In this persona, we complete the Preference Survey as if we were an organization primarily concerned with making Amazon stockholders wealthier. Such an organization’s Preference Profile might look like this:

[Figure: Preference Profile for Persona #1]

Based on these preferences, the fifteen cities with the highest Value Ratings are:

[Figure: the fifteen cities with the highest Value Ratings for Persona #1]

Persona #2: Good Corporate Citizen

In the second persona, we complete the Preference Survey as if we were an organization seeking to cultivate an urban working/living district that will benefit not only investors but also customers, suppliers, and the community at large. Such an organization’s Preference Profile might look like this:

[Figure: Preference Profile for Persona #2]

Based on these preferences, the fifteen cities with the highest Value Ratings are:

[Figure: the fifteen cities with the highest Value Ratings for Persona #2]

Comparing the Top Performers from Each List

We expect that cities performing strongly across the board would rate highly in any case. We see this for Atlanta and San Francisco. Seattle is included as a point of reference because, according to Jeff Bezos, Amazon founder and CEO, “We expect HQ2 to be a full equal to our Seattle headquarters.”

Beginning with 4th place, however, the lists diverge.

Some cities appear only in the Satisfy the Investors list:

  • Dallas-Fort Worth-Arlington, TX
  • Miami-Fort Lauderdale-West Palm Beach, FL
  • Riverside-San Bernardino-Ontario, CA
  • Houston-The Woodlands-Sugar Land, TX
  • Tampa-St. Petersburg-Clearwater, FL
  • Orlando-Kissimmee-Sanford, FL

Other cities appear only in the Good Corporate Citizen list:

  • Philadelphia-Camden-Wilmington, PA-NJ-DE-MD
  • New York-Newark-Jersey City, NY-NJ-PA
  • San Diego-Carlsbad, CA
  • Chicago-Naperville-Elgin, IL-IN-WI
  • Baltimore-Columbia-Towson, MD
  • Austin-Round Rock, TX

Notice the range of Value Ratings for the top fifteen cities differs in each list. Corporate persona #1 ranges from 76.7% to 99.8%, while persona #2 runs from 62.7% to 98.8%. This suggests that the Preference Profile of persona #2, with the wider range, skews toward attributes for which strong performance is less common.

Finally, some cities appear on both lists, usually in different positions:

  • Atlanta-Sandy Springs-Roswell, GA
  • San Francisco-Oakland-Hayward, CA
  • Portland-Vancouver-Hillsboro, OR-WA
  • Boston-Cambridge-Newton, MA-NH
  • Raleigh, NC
  • Washington-Arlington-Alexandria, DC-VA-MD-WV
  • Minneapolis-St. Paul-Bloomington, MN-WI
  • Denver-Aurora-Lakewood, CO

Preferences and Methods Matter

To summarize, Value Ratings give a personalized prioritization of all the choices you face. Shortly after Amazon’s announcement, many media outlets presented their evaluations of the deserving cities. These appeals either ignored Amazon’s preferences or presumed to know what they are (or ought to be). As shown above, the pool of better choices is highly dependent on the decision-maker’s preferences.

Technical advantages of Value Ratings

This post describes four notable technical advantages of Value Ratings.

An Intuitive and Informative Rating Scale

A Value Rating is a number that ranges from 0% to 100%. So, an option with a Value Rating near 100% performs very well in the areas that matter most to you. An obvious use of Value Ratings is simply to rank your options. However, the Value Rating score itself offers much more. For instance, it reveals whether there is a clear leader, whether a contender is close or distant, whether the overall field is competitive, whether a strong entrant is absent, and so on. And for especially critical decisions, Value Ratings offer an assessment of Evidence Strength in addition to sequential rankings.

Bias Protection

The Preference Survey is a series of qualitative questions that quantify your preferences for different kinds of performance. You might ask: Can I simply sit down and, after some thought, arrive at similar quantities? Beware, because this is a risky and unreliable approach. Relying on sheer reflection exposes you to mental pitfalls. One example is called the Availability Heuristic, in which easy-to-recall situations become over-represented. We call this pitfall Introspection Risk, and the Preference Survey defends against it.

Missing Data are Less Problematic

When comparing complex options, it seems reasonable to “collect all the data” before any evaluation. However, data are often not available for each category of performance and for every option. This tends to stall progress, because the typical thinking is that complete data are required. Value Ratings lessen this problem. When missing data are associated with areas of little importance to you, some choices may still garner high Value Ratings – if they are performing well in the areas you do care about.
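A small Python sketch shows why a gap in a low-priority area need not stall the evaluation. This post does not say how Value Ratings treat missing data, so the sketch makes an assumption: a missing measure simply contributes nothing, which gives a conservative lower bound. The attribute names, weights, and scores are hypothetical.

```python
def rating_with_missing(weights, performance):
    """Weighted score in which a missing measure (None) contributes nothing.

    Assumption for illustration: missing data count as zero rather than
    being imputed, so the result is a conservative lower bound.
    """
    total = sum(weights.values())
    earned = sum(
        weight * performance[attribute]
        for attribute, weight in weights.items()
        if performance.get(attribute) is not None
    )
    return earned / total


# Hypothetical preferences: parking barely matters to this person.
weights = {"quality": 0.60, "cost": 0.35, "parking": 0.05}
# Hypothetical option with no parking data available.
performance = {"quality": 0.95, "cost": 0.90, "parking": None}

rating = rating_with_missing(weights, performance)
print(f"Rating despite missing data: {rating:.1%}")
```

Because the missing measure carries only 5% of the weight here, the option can lose at most five points from the gap, and it still rates highly on the strength of the areas that matter.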

Consistency Checks for Your Responses

As noted above, the Preference Survey converts your subjective responses into statistics representing how much you favor one type of performance over the others. Naturally, it is important to be consistent in responding to the survey. As a precaution, rules automatically check the consistency of your responses. If your responses imply changing preferences over the course of the survey, you will receive a notice to review them before proceeding.
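The rules used by the Preference Survey are not described in this post. As one hedged illustration of what a consistency check can look like, the Python sketch below assumes the responses can be read as pairwise statements of the form "I prefer X over Y" and flags any circular chain among them. The function name and the sample answers are hypothetical.

```python
def find_inconsistency(answers):
    """Return a list of attributes forming a preference cycle, or None.

    `answers` is a list of (preferred, other) pairs. A cycle such as
    A > B, B > C, C > A signals preferences that shifted mid-survey.
    """
    graph = {}
    for preferred, other in answers:
        graph.setdefault(preferred, set()).add(other)
        graph.setdefault(other, set())

    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Found a loop: report it from its first appearance onward.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for neighbor in sorted(graph[node]):
            cycle = visit(neighbor)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


consistent = [("safety", "price"), ("price", "warranty"), ("safety", "warranty")]
circular = [("safety", "price"), ("price", "warranty"), ("warranty", "safety")]

print(find_inconsistency(consistent))
print(find_inconsistency(circular))
```

A survey built this way could show the respondent the reported chain and ask which of those answers to revisit before proceeding.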

Political advantages of Value Ratings

This post describes three noteworthy political advantages that Value Ratings offer.

Naturally Defensible

Sometimes, in highly competitive or hypersensitive settings, lackluster ratings can be provocative. They can trigger criticism from advocates of the rated entities, be they products, persons, or parties. Fortunately, Value Ratings contain a natural safeguard to fend off criticism from parties disappointed by specific ranking results. The reason is simple but subtle.

Recall that Value Ratings reflect objective performance data combined with what you care about most. Therefore, a grievance against a poor Value Rating is also a criticism of your preferences. Few critics would choose to suggest another person’s preferences are misinformed or ought to change. Instead, Value Ratings steer critics toward a more productive strategy for obtaining high rankings — to strive for strong performance in most or all performance areas.

Enhanced Credibility

In a prior post, we saw how one-size-fits-all ratings fail individual users. By ignoring people’s personal preferences, these popular rating systems can profit by licensing promotional rights to high performers. This creates real or perceived conflicts of interest, undermining the confidence of consumers and other users who seek guidance.

Value Ratings have the opposite effect. Value Ratings reinforce the trustworthiness of the rating organization. Consumers receive personalized guidance, with an open recognition that what suits one person need not suit another. This openness dispels concerns about hidden motives and accentuates the rating organization’s goodwill and neutrality.

Education and Discovery

Value Ratings help people act when faced with complex choices. Prior to choosing, though, powerful learning opportunities arise as people complete their Preference Surveys, and they discover their personal Preference Profiles. You will see how your preferences apportion across seemingly incomparable attributes. This knowledge can be clarifying, reassuring, and conducive to taking action.

Finally, we know that over time, as experience and learning accumulate, our preferences can shift. You can easily update your Preference Profile. You will become aware of the attributes that are growing or shrinking in importance to you.

Your preferences alter the market you face

In a recent post, we saw how the medical group rankings for Alice and Bob differed sharply because their individual Preference Profiles differed. There is another way the power of preferences asserts itself. Your preferences alter the market you experience.

Preferences Alter the Market: an Example

In this example, we return to Alice and Bob, and their search for medical groups using real performance results developed and published by The Washington Health Alliance.

In the diagram below, we focus on the top and bottom performers for Alice and Bob, respectively. We want to spotlight differences in the range or variation across all medical groups. Because their Preference Profiles differ, what the market has to offer Alice and Bob differs as well.

[Diagram: top and bottom Value Ratings across medical groups for Alice and Bob]

Let’s begin with Bob. When we compare the Value Ratings for his best and worst performing medical groups, we see he faces a market with a 6-fold degree of variation. However, for Alice the corresponding figure is a whopping 21-fold difference. Even when ignoring the best and worst performing medical groups for each, Alice still faces a market with nearly double the variation that Bob experiences. And remember: these astounding differences exist when considering the identical slate of choices.
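The "fold" figures are ratios of the highest Value Rating to the lowest. Here is a minimal Python sketch of that arithmetic. The ratings are hypothetical values chosen only to reproduce the 6-fold and 21-fold figures, not the Washington Health Alliance results, and measuring the trimmed comparison as a ratio is likewise an assumption.

```python
def fold_variation(ratings):
    """Ratio of the highest to the lowest Value Rating in a market."""
    values = list(ratings.values())
    return max(values) / min(values)


def trimmed_fold_variation(ratings):
    """Same ratio after dropping the single best and single worst performer."""
    values = sorted(ratings.values())[1:-1]
    return max(values) / min(values)


# Hypothetical Value Ratings for the same five medical groups.
bob = {"A": 0.90, "B": 0.60, "C": 0.45, "D": 0.30, "E": 0.15}
alice = {"A": 0.84, "B": 0.38, "C": 0.20, "D": 0.10, "E": 0.04}

for name, ratings in (("Bob", bob), ("Alice", alice)):
    print(
        f"{name}: {fold_variation(ratings):.1f}-fold overall, "
        f"{trimmed_fold_variation(ratings):.1f}-fold without the extremes"
    )
```

The slate of medical groups is identical in both dictionaries; only the ratings differ, which is the sense in which each person faces a different market.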

How Can this Be?

Compared to Bob, Alice’s Preference Profile places more emphasis on attributes where higher performance appears to be harder to achieve. We might refer to this as the degree-of-difficulty; some kinds of performance are harder than others. Alice prefers such attributes more, compared to Bob. As a result, Alice sees a market of medical groups with more divergent Value Ratings than does Bob.

This illustration further confirms the shortcomings of static, one-size-fits-all rankings, Top 10 lists, and other inflexible approaches to rating alternatives. Without accounting for individual preferences with Value Ratings, we can easily mislead when we intend to inform.

Preferences Matter: Alice and Bob choose medical groups

Alice and Bob are back. Previously, they each needed to buy cars. Now it’s health care. They each must find a medical group for their primary health care. We will see how their preferences matter.

Luckily, Alice and Bob live in a part of the country where a ‘data pooling collaborative’ exists. This is an organization devoted to measuring the performance of health care providers, health plans, and other players in the industry. Alice and Bob live in Washington State and their collaborative is the Washington Health Alliance.

When they were shopping for automobiles, Alice and Bob relied on consumer websites that evaluate various models. When it comes to medical groups, the Washington Health Alliance is their objective source of trusted, neutral performance data.

As with cars, Alice and Bob have different preferences. We give each of them a Preference Survey to see how their Preference Profiles differ – and which medical groups rate best.

The Washington Health Alliance collects data from health plans and employers to measure performance using methods approved by its multi-stakeholder board of directors. For medical groups, there are more than twenty performance measures available. We can organize the measures into four groups, or attributes, that Alice and Bob each will consider in selecting a medical group:

  1. Prevention (screening for diseases)
  2. Disease care (treatments for diagnosed conditions)
  3. Wise use of services (avoiding unnecessary treatments and expenses)
  4. Experience of care (how other patients perceived care delivered in a doctor office setting)

Preference Surveys

As they learned when shopping for cars, Alice and Bob found it quite difficult to judge with precision and confidence how the intensity of importance varies among these dissimilar attributes. Fortunately, we can help Alice and Bob discover their relative preferences by using the Preference Survey. Brief, simple, and non-mathematical, the survey detects both the priority and relative importance of preferences across multiple attributes.

Here are the results from Alice’s Preference Survey:

[Figure: Preference Profile for Alice]

We can see that Alice places a large premium on having a good experience of care (52%). Expertise in prevention and disease care together account for nearly the rest of what matters to her (44%). Alice is least interested in practices that strive to keep costs low and avoid unnecessary services; wise use of services is about one-tenth as important to her as experience of care.

Now for Bob’s Preference Survey results:

[Figure: Preference Profile for Bob]

Wise use of services matters most to Bob (55%). Moreover, he has a much weaker preference for experience of care than Alice does (23%). Prevention is roughly half as important to him as disease care. Bob’s results tell us, among other things, that wise use of services is about seven times more important to him than prevention.
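Bob's two stated percentages are enough to estimate the shares the text leaves implicit. The Python sketch below assumes the four shares sum to 100 and treats "roughly half" as exactly half. The resulting figures are therefore approximations, not Bob's published profile.

```python
wise_use = 55.0     # stated in the text
experience = 23.0   # stated in the text

# Share left over for prevention and disease care combined.
remaining = 100.0 - wise_use - experience

# Assumption: prevention is exactly half of disease care, so p + 2p = remaining.
prevention = remaining / 3
disease_care = 2 * prevention

ratio = wise_use / prevention

print(f"prevention   = {prevention:.1f}%")
print(f"disease care = {disease_care:.1f}%")
print(f"wise use is {ratio:.1f}x as important as prevention")
```

Under these assumptions the ratio comes to 7.5, in line with the "about seven times" reading of Bob's results.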

Preference Profiles

There is nothing odd about such differing Preference Profiles. Alice is exemplary of what we might call a surveillance and treatment mindset, while Bob represents those who believe that less can be more when it comes to health care. We might even imagine that Alice’s health plan is rich with benefits and shields her from bearing the costs of care; in contrast, Bob’s plan might carry a large deductible and oblige him to share in paying for services. With personal preferences, there is no right or wrong; each person draws on his or her personal knowledge, experience, and unique circumstances.

Now let us turn to the Value Ratings for medical groups, recognizing that Alice and Bob have sharply differing Preference Profiles.

Value Ratings

A Value Rating is a personalized composite measure; each Value Rating is a single number that lets you rank complex choices from most to least desirable. Although Alice and Bob are fictitious, the Value Ratings below derive from real performance data developed by the Washington Health Alliance; they regularly publish results on their dedicated website, www.wacommunitycheckup.org.

Below we display how actual medical groups in Washington State ranked for Alice and Bob, given their differing preferences. In this illustration, we have coded each medical group with a letter.

[Figure: medical groups ranked with Value Ratings for Alice and Bob]

Although derived from identical underlying performance results, the summary rankings for Alice and Bob differ. This stems from their differing Preference Profiles.

Take a closer look at medical groups Q and B. They are middle-of-the-pack options for Alice; however, for Bob, these two choices could hardly be more different. To Bob, medical group Q’s Value Rating is more than double that of medical group B. The reason: these two medical groups performed very differently in wise use of services, the attribute most important to Bob. Alice placed little importance on this attribute, so her rankings tone down the performance disparity between these two medical groups.
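The effect can be sketched in Python with a plain weighted average standing in for the Value Rating, whose exact formula is not given here. Several inputs are assumptions: Alice's 44% is split evenly between prevention and disease care, Bob's prevention and disease care shares are approximated as 7% and 15%, and the performance scores for groups Q and B are hypothetical rather than Washington Health Alliance results.

```python
ATTRIBUTES = ("prevention", "disease_care", "wise_use", "experience")

# Preference Profiles as fractions; see the assumptions noted above.
alice = {"prevention": 0.22, "disease_care": 0.22, "wise_use": 0.04, "experience": 0.52}
bob = {"prevention": 0.07, "disease_care": 0.15, "wise_use": 0.55, "experience": 0.23}

# Hypothetical performance on a 0..1 scale; the groups differ only in wise use.
groups = {
    "Q": {"prevention": 0.5, "disease_care": 0.5, "wise_use": 0.9, "experience": 0.5},
    "B": {"prevention": 0.5, "disease_care": 0.5, "wise_use": 0.2, "experience": 0.5},
}


def value_rating(profile, performance):
    """Weighted average of performance, using the person's own weights."""
    total = sum(profile[a] for a in ATTRIBUTES)
    return sum(profile[a] * performance[a] for a in ATTRIBUTES) / total


for name, profile in (("Alice", alice), ("Bob", bob)):
    q = value_rating(profile, groups["Q"])
    b = value_rating(profile, groups["B"])
    print(f"{name}: Q={q:.3f}  B={b:.3f}  Q/B={q / b:.2f}")
```

With these inputs, Q rates more than twice as high as B for Bob, while the two groups are nearly tied for Alice, even though both people see identical performance data.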

Preferences Matter

This illustration, using real performance data for actual medical groups, confirms the shortcomings of static, one-size-fits-all rankings, Top 10 lists, and other inflexible approaches. Without accounting for individual preferences, we can easily mislead when we intend to inform.

Preferences Matter: Alice and Bob go car shopping

Alice and Bob each need to buy cars. They are different people, from households with different needs, and – you guessed it – their preferences differ, too. Let’s see why their preferences matter.

Let’s keep this simple and imagine there are four important attributes that Alice and Bob each will consider in making their choices:

1. Ride quality
2. Price
3. Crash safety
4. Warranty

Of course, it’s easy to simply declare that all four attributes are ‘equally important’ and move on. But this strategy is almost certainly inaccurate and could even be a major mistake. It is somewhat trickier to be more realistic and definitively rank the four attributes from most to least important. However, it is quite difficult to judge — with precision — how the intensity of importance varies among these attributes. Some people find that task nearly impossible to complete with confidence; they could use a little guidance.

Preference Survey to the Rescue

We can help Alice and Bob discover their relative preferences by using the Preference Survey. Quick and easy to take, the survey detects both the priority and relative importance of your preferences across multiple attributes.

The Preference Survey is indispensable when attributes are dissimilar or lack parity with one another. For example, most people would agree that ‘crash safety’ and ‘ride quality’ are not on par with one another, because health and luxury are not interchangeable.

Here are the results from Alice’s Preference Survey:

[Figure: Preference Profile for Alice]

These results apply only to Alice and represent what is important to her. The percentages for all four attributes add up to 100; viewed altogether, these figures constitute her Preference Profile.

The Preference Survey reveals that crash safety is by far the dominant consideration for Alice. Price is next most important, but it is a distant second. The results let us estimate that Alice values crash safety about three times more than price. We can also see that ride quality and warranty, taken together, are still not as important to her as price.

Now let’s see Bob’s Preference Survey results:

[Figure: Preference Profile for Bob]

Price matters least to Bob. He is someone looking for a safe and smooth-riding car. These two attributes account for 82% of what matters to him. Bob’s Preference Profile is very different from Alice’s, but we knew from the start that they had unique preferences; quantifying the specifics of precisely how they differ is something only the Preference Survey reveals.

Preferences Matter

Mathematical relationships like these are extremely difficult to specify by merely self-reflecting. In a few moments, the Preference Survey untangles and measures your underlying preferences. Perhaps best of all, taking the Preference Survey is a non-mathematical experience.

There are no right or wrong responses to the Preference Survey. By now, it should be obvious that Alice and Bob will in all likelihood select different cars as best, even if they refer to the same performance information, say from a consumer website. This is why popular ‘best-value’ designations are essentially meaningless in practice.

One-size-fits-all rankings: Why do they persist?

Common sense tells us that the same “Top Ten” list of hospitals or automobiles cannot possibly apply with much success to individual persons. So, why do these one-size-fits-all rankings persist? One reason is money. Rating organizations that publicize their accolades often rely on a business model that forces them to adopt these inflexible techniques.

The One-Size-Fits-All Business Model

Here is how that business model works. Suppose a rating organization is judging widgets. The organization ranks the widgets from best to worst, using some fixed-weighting rubric. The fixed weights imply that all consumers have identical preferences across the different widget features.

Next, the organization may contact the manufacturers of the highest-performing widgets to congratulate them. They may invite the manufacturer to license the right to cite their high ranking in marketing and promotional materials.

Today, we can see this business model in healthcare, particularly for hospital ratings. Jordan Rau’s 2013 Kaiser Health News article, excerpted below, illustrates the practice:

Here’s another surprise for consumers: Healthgrades, U.S. News, and Leapfrog charge hefty licensing fees to hospitals that want to advertise their awards. The fees can range from $12,500 to the mid-six figures. Healthgrades, for example, told NYU Langone Medical Center in Manhattan it would cost $145,000 to use its citations, according to Dr. Andrew Brotman, NYU’s chief clinical officer.

MedStar Health, which runs three hospitals in the District and seven in Maryland, pays $70,000 to US News to use its name and awards when promoting its hospitals, says Jean Hitchcock, MedStar’s vice president for public relations and marketing. With such an economic structure, it’s no surprise that the raters keep finding more ways to rank hospitals.

It’s easy to see that this licensing strategy would fall apart if ratings were flexible, reflecting the preferences of each user.

A Personalized Composite Measure

Now contrast one-size-fits-all ratings to a system producing personalized scores for each consumer. Tailored Value Ratings result in rankings that vary according to each person’s preferences. This method is more accurate, relevant, and practical; it rejects assumptions of fixed and universal consumer preferences.

The one-size-fits-all approach persists largely as a benefit for the rated entities; they capitalize on their ranking for promotional purposes. For consumers, there is little or no use for one-size-fits-all. With tailored Value Ratings, we put each consumer and his or her unique Preference Profile first. Value Ratings sidestep business models that put sellers’ interests ahead of what is important to each consumer.