But this week one of Gates' bigger data projects, the Measures of Effective Teaching study, was found wanting:
The recommendations in the final Measures of Effective Teaching work products may not be supported by the project's hard data, the National Education Policy Center contends in a review of the project.
The review, released last week, was written by Jesse Rothstein, of the University of California, Berkeley, and William Mathis, of the University of Colorado at Boulder. The NEPC has taken issue with several prior work products from the MET study.
In the critique, the scholars take aim at the study's randomization component, the basis for the MET report's headline finding that projections based on the three measures studied, which include "value added" test-score analysis, seemed to be quite accurate overall. But Rothstein and Mathis note that there was a high degree of noncompliance with the randomization, and also suggest that teachers of certain students appear more likely to have dropped out of the study. (Rothstein made a similar point in Education Week's story on the final MET results.)
The scholars also say that none of the three main measures studied—student surveys, value-added test-score growth, and observations of teachers—was particularly predictive of how teachers' students would do on the alternative, "conceptually demanding" tasks. That's potentially worrisome, since the tests being designed to measure the Common Core State Standards are purportedly more in line with such tasks. "There is evidently a dimension of effectiveness that affects the conceptually demanding tests that is not well captured by any of the measures examined by the MET project," the authors write.
The scholars also question one of the very premises of the MET study: its use of growth in student test scores as the baseline standard for comparing the impact of all the measures it tested.
"It is quite possible that test-score gains are misleading about teachers' effectiveness on other dimensions that may be equally or more important," the paper states.
Bruce Baker also found this Gates Foundation MET study wanting, noting:
I’ve written several previous posts explaining the absurdity of the general framework of this research which assumes that the “true indicator of teacher effectiveness” is the following year value-added score. That is, the validity of all other indicators of teacher effectiveness is measured by their correlation to the following year value added (as well as value-added when estimated to alternative tests – with less emphasis on this). Thus, the researchers find – to no freakin’ surprise – that prior year value added is, among all measures, the best predictor of itself a year later. Wow – that’s a revelation!
As a result, any weighting scheme must include a healthy dose of value-added. But, because their “strongest” predictor of itself analysis put too much weight on VAM to be politically palatable, they decided to balance the weighting by considering year to year reliability (regardless of validity).
The hypocrisy of their circular validity test is best revealed in this quote from the study:
Teaching is too complex for any single measure of performance to capture it accurately.

But apparently the validity of any/all other measures can be assessed by the correlation with a single measure (VAM itself)!?????
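Baker's circularity point can be illustrated with a toy simulation (mine, not from the study). Suppose teacher effectiveness has two independent dimensions, only one of which shows up in test scores. A hypothetical value-added score samples the test dimension plus year-specific noise, while a hypothetical observation score samples both dimensions equally. If "validity" is defined as correlation with next year's VAM, VAM beats the observation measure by construction, even though the observation measure captures real effectiveness that VAM misses:

```python
import random, math

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

random.seed(0)
n = 20000

# Two independent "true" dimensions of teacher effectiveness.
test_dim = [random.gauss(0, 1) for _ in range(n)]   # shows up in test scores
other_dim = [random.gauss(0, 1) for _ in range(n)]  # does not

# VAM picks up only the test dimension, plus fresh noise each year.
vam_y1 = [t + random.gauss(0, 1) for t in test_dim]
vam_y2 = [t + random.gauss(0, 1) for t in test_dim]

# A (hypothetical) observation score picks up both dimensions equally.
obs = [0.5 * t + 0.5 * o + random.gauss(0, 1)
       for t, o in zip(test_dim, other_dim)]

# Judged by correlation with next-year VAM, VAM "validates" itself best,
# simply because the criterion shares its dimension and ignores the other.
print("VAM y1 vs VAM y2:", round(corr(vam_y1, vam_y2), 2))
print("obs    vs VAM y2:", round(corr(obs, vam_y2), 2))
```

The point is not the particular numbers but their ordering: whichever measure shares the criterion's dimension wins the horse race automatically, which is exactly the "best predictor of itself" objection.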
This was an ideological study with a political agenda.
The Gates Foundation people go into the study assuming that student test scores adequately reflect teacher performance and that value-added models accurately isolate each teacher's contribution to those scores.
They then seek to confirm those assumptions with the data, and twist themselves into pretzels to do so.
The newspapers report the findings, even though the findings are horse hockey, and then we get politicians (or union leaders) pointing to them as sufficient justification for the new value-added or growth-model teacher evaluation systems.
As always, the fix is in.
Findings first, data and research after.