I promised in a recent post called The Tyranny of the Datum to write about some guiding standards for appropriate data usage, in the spirit of Isaac Asimov’s Three Laws of Robotics, and I will do that here today. Before I get to that, though, I’d like to briefly discuss–in a general sense–what I see as a fairly blatant disproportionality in evaluative rigor as applied to various facets of K-12 education.
A frequent complaint about the educational “status quo” has been that there is too little evaluative rigor. This complaint has been raised by those who note that high percentages of teachers are rated highly proficient by their appraisers even when comparatively lower percentages of students pass their required state tests. The suggestion is that the evaluation mechanism is broken and therefore isn’t adequately identifying strong versus weak teachers. The blame for this is generally placed at the feet of teacher unions, faulted by reformers on the left and right alike for protecting bad teachers and, as a consequence, compromising the teaching profession and harming students. The proposed remedy is increased evaluative rigor in the form of appraisals tied–in some percentage or another–to student test scores. This cause has been embraced by the federal Department of Education, and such appraisal requirements are embedded in waivers from No Child Left Behind. My state of Texas–a recent recipient of a federal NCLB waiver–will soon begin piloting a new teacher appraisal system that ties student test scores to teachers’ job performance appraisals.
Another area where we hear calls for more rigorous evaluative protocols is in teacher preparation. Many of the people who have tasked themselves with improving the American K-12 public education system (deemed failing primarily as a result of its students’ performance on international standardized tests and, secondarily, due to the high percentage of graduates requiring remediation when they enter post-secondary institutions) have identified teacher training as an area of weakness. Individuals and organizations have devised sometimes controversial rubrics for judging these programs and have released their results to the public. Others have suggested tying student test scores not only to their teachers but also to the teacher prep programs that trained their teachers. This is, to steal a quote from my book Test-and-Punish, like the Six Degrees of Kevin Bacon game, except that it’s Six Degrees of Student Test Scores.
These are only two examples of the increased evaluative rigor that has become, ahem, de rigueur. I could also point to increased fiscal scrutiny facing public schools–in Texas, for example, we have the half-decade-old FAST reporting system from the Comptroller’s office. It rates schools based on a comparison of their spending to their test scores. (The test scores appear to be the linchpin for all comparisons and quality conclusions.)
I’m not complaining about evaluative rigor. As a person who values public education as our nation’s greatest equalizer–the most effective and broadly implemented social structure we have that mixes our classes and races and lets them not only get to know one another but also, in theory at least, lets them struggle together against a common foe of ignorance, as well as against one another for preeminence in the classroom–I want public education to be great. I wish the system that I’ve given my adult life to were above reproach, but it isn’t. Adequate evaluation is vital. I have concerns that we are over-extending and misusing student test scores fairly pathologically, but that doesn’t mean that I oppose improvements to our procedures for judging schools, administrators, teachers, and teacher prep programs. Certainly improvements are possible and necessary.
My concern, then, isn’t that I think we should evaluate less in K-12 ed. My concern is that many prominent voices appear to be extremely disproportionate in their calls for evaluative rigor. A few brief examples are in order.
1. Louisiana officials have resisted subjecting voucher schools to the same test-based accountability systems that traditional schools in that state are held to. The rationale for this distinction was that the free market would be the accountability, that parents would choose to send their kids to the better schools and the worse schools would close for lack of business. But, of course, in a system of choice, the public schools are also a choice, so you would think the free-market-in-lieu-of-test-based-accountability argument would extend to those schools as well. It did not. Rigorous accountability for purposes of evaluation remained for the traditional public schools. The great wrong here is not the accountability system (although it most likely had its flaws). The big problem was disproportionate evaluative rigor.
2. Charter school application processes have in many states been rubber stamp parties. Ohio–where the school choice juggernaut known as the Fordham Foundation resides–has some of the nation’s most lax charter authorizing practices, and also some of the nation’s most underperforming charter schools. In other states–my own included–it has come to light that many charter school applications are literally copied from other charter schools’ applications. This floors me: schools that are introducing themselves to the public as potential academic caretakers of our children commit the capital academic offense of copying right out of the gate, on their very first assignment. And yet, for all the hue and cry over a lack of rigorous evaluation in our teacher prep programs, there is very little noise when it comes to evaluative rigor in charter authorizing. The disproportionality is the thing.
3. Charter policing is also often lax. This is perhaps to be expected, as charters were initially devised as vehicles for innovation (as opposed to profitability, which today sometimes takes precedence), and were therefore freed from many of the regulations that apply to traditional schools. However, the deregulation that was intended to create instructional advantages for children has been repurposed to provide competitive advantages for businesspeople and investors. As a result, the number of charter schools with major financial and academic performance problems that actually get closed down is surprisingly small. Choice as an inherent good has overshadowed any calls for quality control in the school choice movement, while the same people advocating for school choice have been relentless in their quest for ever-higher bars that traditional schools must clear.
There are certainly many more examples of disproportionality in evaluative rigor as it is applied to aspects of traditional K-12 education versus aspects of reform K-12 priorities, but I don’t have time to cover them all this morning.
Somewhat relatedly, there appear to me to be fairly low standards for determining how data should be used in education. Here are my suggested Three Laws of Data:
1. Data belongs to the human who generated it.
2. Humans should always be informed when their data is being collected and analyzed.
3. If data is being aggregated for the purposes of creating a value meant to reflect positively or negatively on a human, that human should have ongoing access to the aggregated data and the formula that will result in the final value, so that the human has an informed opportunity to rectify any performance-related issues that might compromise his or her final score.
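To make the Third Law concrete, here is a minimal sketch of what a transparent appraisal calculation might look like in code. Everything in it is invented for illustration–the component names, the weights, and the scoring function are hypothetical, not anyone’s actual appraisal formula. The point is simply that both the inputs and the formula are visible to the person being scored, at any time during the year, rather than being revealed only after the final value is fixed.

```python
# Hypothetical illustration of the Third Law of Data: if a value is
# being computed about a person, that person should have ongoing access
# to both the aggregated data and the formula. All names and weights
# below are invented for illustration.

def appraisal_score(components, weights):
    """Weighted average of performance components on a 0-100 scale."""
    assert set(components) == set(weights), "every component must have a weight"
    total_weight = sum(weights.values())
    return sum(components[k] * weights[k] for k in components) / total_weight

# The formula's weights are published up front, not disclosed after the fact.
WEIGHTS = {"observation": 0.6, "student_growth": 0.2, "surveys": 0.2}

# A teacher can check her running score at any point in the year...
midyear = {"observation": 85.0, "student_growth": 70.0, "surveys": 90.0}
print(round(appraisal_score(midyear, WEIGHTS), 2))  # 83.0

# ...see which component is dragging the total down (student_growth,
# here), and have an informed opportunity to address it before the
# final value is computed.
```

Nothing about this is technically hard; the obstacle to this kind of transparency is policy, not programming.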
I’m sure there are other considerations. This is my first swipe at data rules. I’m also sure there are complex documents on this topic from the AERA and the testing companies, but what I’m advocating is a brief and easy-to-understand set of ironclad rules for regular people to be able to cling to in our data-informed future. Also, I’m not strictly thinking about educational data. I’m also thinking about credit scores, Google advertising data collection, and so forth.
Whether we speak of traditional education’s totems or the icons of nouveau reform, the rigor of our evaluations should be the same. We should be particularly mindful to carefully vet standards for our use of data, given how central data will be (and already is) in this field.