When Reason Isn't Enough: Teaching Evaluations
I work at Carleton University as a professor. We are required to administer surveys to our students to evaluate the quality of our teaching in each class. The scores these surveys generate count for a good deal of a professor's job performance evaluation when it comes to things like tenure and certain teaching awards.
I get a free newspaper called the CUAT Bulletin*, which claims to be "Canada's voice for academics." The most recent issue (January 2012, 59:01, page A2) features a commentary written by eight people entitled "Student Surveys a Poor Measure of Teaching Competence." I read it with interest.
The commentary is well-reasoned. Here are its points:
- It draws an analogy with doctors and their patients: a patient might rate a doctor poorly because of things the doctor did that were painful but necessary. The authors describe the perverse incentive this would create in medicine.
- Struggling students will give low assessment scores even if the teacher is doing a good job.
- Student assessments measure an emotional disposition before the full effect of the course is known to the student.
- They conclude that the measurement is arbitrary and should not be used.
We live in a culture (or at least I do) where reason is lauded, and thought to be the best way to think and come up with beliefs. Usually I'm on the side of reason. I'm on the side of reason when the only alternative is irrationality: wishful thinking, biased thinking, selfish thinking, etc.
However, when reason is contrasted with evidence, I'm afraid I will almost always side with the evidence.
And evidence is exactly what this commentary is missing. Not a shred of evidence is provided for the thesis of the commentary, or for any of its main points.
But first, let's talk about measurement for a moment. Is it possible that someone likes beer and also would not consider having sex on the first date? Of course it is. But that does not mean that asking someone if they like the taste of beer isn't predictive of whether or not they might consider sex on the first date. In fact, if someone likes the taste of beer, they are 60% more likely to want sex on the first date.**
The most important thing in measurement is how accurate the measurement is. If the measurement is accurate, it really doesn't matter whether the measurement "makes sense." In this case, you can reason your way out of thinking that beer liking predicts first date sex having, but you'd be reasoning yourself out of something that has some truth to it.
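To make the arithmetic concrete, here is a minimal sketch of what "60% more likely" means. The counts below are invented purely for illustration (they are not the OkCupid data); they are chosen so the ratio comes out to 1.6. Notice that more than half of the hypothetical beer likers still answer "no," and the question is predictive anyway.

```python
# Hypothetical survey counts, made up for illustration only.
# Each group: how many said yes/no to considering sex on a first date.
counts = {
    "likes_beer": {"yes": 48, "no": 52},
    "dislikes_beer": {"yes": 30, "no": 70},
}


def rate(group):
    """Share of a group that answered yes."""
    return group["yes"] / (group["yes"] + group["no"])


def relative_likelihood(group_a, group_b):
    """How many times more likely group_a is to say yes than group_b."""
    return rate(group_a) / rate(group_b)


ratio = relative_likelihood(counts["likes_beer"], counts["dislikes_beer"])
exceptions = counts["likes_beer"]["no"]  # beer likers who said no

print(f"Beer likers are {ratio:.1f}x as likely to say yes")
print(f"Beer likers who said no anyway: {exceptions}")
```

A ratio of 1.6 is the same thing as "60% more likely." The exceptions do not cancel the prediction; they just mean it is not perfect.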
You can also predict pretty well how good a chess player someone is by how many chess books they have in their house. This is true even though someone who sucks at chess might have those books anyway. (I can't remember where I read this.)
Which brings me back to teaching evaluations through student questionnaires. All of the points made in the commentary basically amount to nothing if the student evaluation is accurate. And even though educational research is not my specialty, it took me about 20 seconds of web searching to find some evidence that is relevant to the issue.
There are limitations to student evaluations: they are bad at evaluating course goals, content, design, materials, and evaluation of student work. However, they do enable the detection of patterns in teaching development. Was that so hard?
I'm not going to make this a debate about the quality of teaching evaluations; that's not my point. My point is that eight people wrote this article with a title that begs for evidence, and provided none of it. How can eight scholars not have thought to look at the evidence? Either they didn't think to look for it, or they (falsely) imagined there would not be any. Maybe they thought that other scholars reading the commentary would not be persuaded by evidence. That's insulting.
There are 100 plausible ideas for every one that's true.
Demand evidence. Reason is often not enough.
* What does CUAT mean? Good question. I found their website and the unpacking of the abbreviation does not appear to be on the front page of the website. I have no idea what it stands for.
** This is from the terrific OkCupid blog: http://blog.okcupid.com/index.php/the-best-questions-for-first-dates/
Pictured: Students in Ecuador. From Wikimedia Commons.