Working with a range of evidence
Evidence-based policymaking is currently attracting considerable attention and investment. Examples include the federal, bipartisan Commission on Evidence-Based Policymaking; the Pew Charitable Trusts' reports and toolkits on How States Engage in Evidence-Based Policymaking; and the Evidence-Based Policymaking Collaborative, supported by the Laura and John Arnold Foundation. Many of these initiatives focus on "rigorous evaluation" -- meaning randomized controlled trials (RCTs) or other experimental models -- as the best kind of evidence to use. However, this body of evidence is often insufficient for the specific kinds of questions that governments face when making a policy decision. In this guide, we propose an approach to help governments incorporate additional types of evidence into their decision making.
One way to make sense of a range of evidence is to frame and define it using Nesta’s Standards of Evidence. The British nonprofit Nesta works to improve the way UK government policies and programs function. One of its key areas of work has been developing What Works Centres, organizations that collect, commission, synthesize, and disseminate evidence about intervention effectiveness across a range of social impact areas. Nesta developed the following Standards of Evidence to document how different kinds of information provide different levels of evidence.
The importance of potentially biased information
An evidence classification that acknowledges the value of non-scientific forms of evidence presents a challenge: how to weigh all of the pieces against one another when you use them. To understand how to do this, we have to think about where potentially biased information belongs in decision making.
When we look at information that has not been rigorously evaluated, we accept that this information might be biased. The whole reason we conduct rigorous evaluation is to remove bias from our observations. We conduct structured trials to produce results that are consistent and generalizable. If a finding is unbiased, that accuracy can be demonstrated, because any independent tester who follows an identical process can reproduce it.
You can look at potentially biased information as simply information that is earlier on its path to refinement, replication, and verified accuracy. A piece of information based on any one individual’s observations is likely to be biased because it comes from just one source. Bias is reduced by gathering many independent sources of information and identifying what they have in common.
Maybe the biased information makes a claim that is too broad. It might say that something is true under all conditions when it is really only true under certain conditions, or that the effect of an intervention is bigger than its true average, or that an effect lasts over time when it does not. Making the claim more specific, by critically reviewing the information to see whether particular conditions must be met to achieve the outcomes, might make it more accurate.
If you tested your claim, found that it did not hold up in some cases, revised it, and tested it again until it repeatedly held up, you could demonstrate that the claim was no longer biased but accurate. This process of testing, improving, retesting, and validating is what we get from structured, scientific study.
With individual pieces of information -- credible hypotheses, case studies, expert opinion -- you cannot assume that kind of accuracy. At the same time, these early observations are foundational to any replicable, scientific observations. We will never get anywhere if we do not start with these pieces of information, even if they might be biased.
Combining levels of evidence to get the most detailed information for your city
It is especially important to accept different forms of evidence when you face questions that have not yet been tested by independent studies. Some of the most important questions relate to things that are very specific to your situation. You face a particular combination of issues, actors, trends, and resources that is ultimately different from every other place in the world.
This matters because, as policy practitioners rather than scientists, you care more about your specific case than about a hypothetical average case. You care less about the total likely range of impacts from a policy than about how this single implementation will work. You want information that gives you the best possible prediction of how a policy will work in your context.
By putting different kinds of evidence into the same framework, Nesta’s Standards of Evidence imply that when you cannot get rigorously tested, experimentally verified information, it is appropriate to work with what you do have. This is relevant for municipal implementation, because no study pulled from a literature review will account for all of the variables you face in your own context.
In the absence of a study that predicts how an intervention works in your specific context, you will need non-scientific information to fill in those gaps. You will need to gather stories from residents, staff members, and other stakeholders; this kind of evidence sits at Level 1 or 2 of the Standards of Evidence.
Some of the kinds of information you will probably get more easily from people than from studies include:
- how people in cities like yours reacted to an intervention in public forums
- how people who implemented an intervention in a city like yours figured out how to get it passed
- how the city communicated publicly about the apparent benefits and drawbacks of the intervention
- the political acceptability of the intervention
You can remain aware that the information you are gathering through individual stories is biased -- not generalizable, informed by specific perspectives -- while also recognizing that having some information about something important is better than having none. Evidence from the first two levels is an essential building block of knowledge and is perfectly useful when you cannot get detailed, unbiased knowledge.
You have to start somewhere, and you should always start with the evidence you have about the specific conditions that you know are important for success.