By Feifei Wang, The Chinese University of Hong Kong
The currently popular notion of “evidence-based” practice assumes that identifying high-quality interventions with valid positive results will improve educational outcomes on a wide scale. Clearinghouses (CHs) advance this process by setting scientific criteria, evaluating the studies that meet them, synthesizing the results, and proposing recommendations. To probe how consistent the meaning of “evidence-based” is across CHs, Cook and colleagues recently examined 12 educational clearinghouses to (1) compare their effectiveness criteria, (2) estimate how consistently they evaluate the same program, and (3) analyze why their evaluations differ.
How variable are CHs in their effectiveness criteria? All of the CHs prefer randomized controlled trials (RCTs) as the experimental design of choice, but they vary in how they test whether an RCT is well enough implemented to merit the highest study-quality rating. Quasi-experimental designs are treated even more variably, with separate standards applied to different design categories. CHs also weight ancillary causal factors differently, such as independent replication and long-lasting intervention effects.
How consistently do CHs rate the same program? Of the 1359 programs analyzed across 10 CHs, 83% were rated by only a single CH. Among the programs assessed by more than one CH, only about 30% received similar ratings. CH ratings of the same program are therefore highly inconsistent.
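As a rough back-of-the-envelope illustration of how little overlap this leaves (assuming the reported percentages apply to the full set of 1359 programs, which the figures above do not state explicitly): $1359 \times 0.83 \approx 1128$ programs were rated by one CH only, leaving $1359 - 1128 = 231$ programs rated by two or more CHs, of which $231 \times 0.30 \approx 69$ received similar ratings across CHs.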
Why do CH effectiveness ratings of the same program differ? Four explanations are given. First, CHs have different inclusion criteria. Second, they examine different versions of the same program. Third, they examine different outcomes of the same program. Fourth, they apply different standards of evidence to single studies and/or to the syntheses of studies they perform.
The authors conclude that, given the inconsistency of effectiveness criteria, implementations, and recommendations across CHs, program effectiveness ratings should be interpreted with caution as a guide to improving educational practice, and that the current identification of “evidence-based” interventions still seems to be “more of a policy aspiration than a reliable research practice.”