What do course evaluations evaluate?

It would be interesting if course evaluations evaluated teaching effectiveness. A low evaluation would mean the students did not learn much, while a high one would indicate a very good teacher. For this to happen, it seems students need to be able to tell whether they learned much.
Given what we know now about self-knowledge,** we shouldn’t expect students to be able to tell how effective some teaching really is. And though the evidence is early and fairly small, it looks as though students are NOT very good at assessing how well they are taught.

See the CHE Article:

In an experiment students heard the same lecture—on why calico cats tend to be female—from two instructors, one fluent and engaging, the other halting and stiff. Unexpectedly, both groups of students scored equally well on a test of the material, even though the students with the better lecturer thought they’d learned more.

** Easy-to-read references: Thinking, Fast and Slow, by Kahneman; The Invisible Gorilla, by Chabris and Simons

10 thoughts on “What do course evaluations evaluate?”

  1. Although I think that it is an absolute travesty that so many universities use student evaluations as a measure of teaching effectiveness, the comments on the article are pretty good about the limitations of this study.

    For example:

    1. Students might be willing to ‘try really hard’ to compensate for a terrible lecture when it is only one minute long, as opposed to three hour-long lectures a week for 10 to 15 weeks.

    2. Students know they’re memory is being tested so they are motivated to try harder.

    3. Recall of material was tested very soon after the lecture but this is not how university classes work. Would there be differences that show up one week later (between engaging and boring lectures)? What about two weeks? A month?

    4. Both students watched a video. Unless they’re taking an online class students do not consume lectures this way. Videos encourage a different kind of engagement than live performances do.

    So, although I agree that it is a terrible (and frankly stupid) inference from ‘good evaluations’ to ‘good teaching’ I don’t think that this is the study that proves that.

  2. As much as I am uncomfortable with being/seeming pedantic, and as much as I have concerns about the cited experiments, I have to say that the first comment is troubling in a number of ways.

    1. The article does not support any inference that the students (all four sets) were ‘trying hard’ to any extent. Indeed, this would seem to be counter to the results of the experiment.

    2. While we are all liable to typos, I expect academics to be able to distinguish between ‘their’ and ‘they’re.’

    3. Setting punctuation aside, how would differences in student memory retention be relevant to the ‘claims’ the study makes about evaluations?

    4. Again, at the risk of being pedantic, I would point out that, while we might speak of ‘both’ sets of students in each of the two experiments, there were four groups of students.

    The observation that watching videos is a different experience from hearing/seeing a live ‘performance’ is interesting, but offers no reason for dismissing the apparent results of the experiments.

  3. Stop Oppression of Women in the Third World


    We are a new organization that is committed to stopping the oppression of women in the third world countries.

    Women in the Third World suffer considerably from the following problems:

    -daily domestic violence

    -brutal rape and torture

    -patriarchal culture without any moderation

    -abusive men

    -honor killings

    -forced poverty and are unable to free themselves since women are considered as property, not human beings, in many third world countries

    We are committed to trying to END this oppression. One of our goals is to create a political lobby or organization that lobbies for visas for single or divorced women from third world countries, so that we can bring them to the West and help them escape the abuse, rape, poverty, and violence of the patriarchal system in most third world countries.

    If you are interested in helping out by donating, or web designing, or campaigning for us, please drop me a line at sowtw1987 AT hotmail DOT com

  4. Hi Chris, a few thoughts

    1. The students were told it was a memory experiment. They were then shown a video. As a subject, what would you do? In a very short (1 minute) scenario, it is easy to stick with a bad lecture and put in the extra work to be fully engaged with it (especially when you know you are part of a study and that your memory will be tested). But this is not ecologically valid. In a classroom setting a student does not have these incentives (of being in a study), a student sits for at least an hour with the lecture, and this happens several times a week, for weeks on end. There is no way to know, from this study, whether engaging or non-engaging lectures result in equal performance when the lectures are…lecture-length (50 minutes to 2 hours or so).

    The point here was that just about anyone can sit through a bad minute of non-engaging material but what about an hour? What about three times a week for many weeks? It’s likely that lecture style makes a difference on those time frames.

    2. This is beyond pedantic.

    3. The claims in the study were that students learned just as much in both the engaging and non-engaging lectures. This was backed up by the claim that all students did about as well on the exam given to them soon after watching the videos. Since what we are trying to figure out is whether or not students really *do* learn just as well, it stands to reason that we would want them to be able to recall information for quite a bit longer than this. This study says nothing about the effects of teaching style on medium- and long-term recall, only immediate recall.

    4. There’s no risk here, you’re being pedantic.

  5. I take the central question to be whether the students could accurately evaluate whether they learned much from the less good teacher. From this perspective, a lot of the details did not affect the conclusion that they didn’t. Thus, how the students ended up getting the same scores is not particularly important, I think, because they did not believe they would.

    Also, the hypothesis that the ‘badly taught’ students could not effectively assess what they learned had an initial plausibility that makes a difference. There are very good reasons for thinking, for example, that we tend to wrongly evaluate our capacity to take in all the visually available details there are before us, how accurate our memories are, etc.

    Notice that I did say the evidence is slight. But it is aimed at one of the most crucial presumptions behind course evaluations, which is the students’ ability to assess how much they learn.

  6. I agree with annejjacobson that the central question the study tries to answer is an epistemological one. That is to say, it seems that the study makes a fairly uncontentious claim that students can’t really assess how much they learned (uncontentious in that it’s difficult for anyone to know how much they know). And I’m totally on board with the study’s implication that student evaluations ought not be the end-all of teacher evaluations.

    However, I think that ejrd’s point re: medium- and long-term recall highlights a glaring problem with the study: it conflates immediate recall with learning. I won’t argue for any particular definition of learning here except to say that it seems that whatever learning is, it can’t simply be immediate recall.

  7. A couple of points:

    – First, is it accurate to say that one of the most crucial presumptions behind student evaluations is that students are able to assess how much they’ve learned?

    I’ve got my own course evaluations in front of me right now. There are 20 questions on those evaluations. Really, only one of those 20 questions (regarding the manner in which concepts are presented) asks students to assess how much they’ve learned. The other 19 are completely unrelated – everything from asking the student how interested I am in teaching to asking whether help is available outside class to asking whether I provided adequate feedback on assignments.

    – Second, post #3 (from SOWTW) looks like it made it past your spam filter.

  8. I guess what I’m getting at is that teaching effectiveness, narrowly construed in terms of learning (as measured by test performance, etc.) just might not be all that central to what course evaluations are up to. And that might be perfectly okay.

  9. Matt: I think it’s actually very difficult to say, since there are many different perspectives on the situation. Must think…

  10. Hi, erjd.

    Sorry to be a pedant, in the derogatory sense. I hope my original comment indicated that the perception of pedantry might be somewhat … a matter of perception.
