According to some interesting new research, the portrayal of strong female characters may be more important than plot content (including sex, violence, and even sexual violence) when it comes to shaping viewer attitudes to women.
Past research has been inconsistent regarding the effects of sexually violent media on viewers’ hostile attitudes toward women. Much of the previous literature has conflated possible variables, such as sexually violent content with depictions of women as subservient.
The submissive characters often reflect a negative gender bias that women and men find distasteful. This negative bias outweighed the sexual violence itself, giving credence to what Ferguson calls the “Buffy Effect”—named after the popular television show Buffy the Vampire Slayer and its strong lead female character.
“Although sexual and violent content tends to get a lot of attention, I was surprised by how little impact such content had on attitudes toward women. Instead it seems to be portrayals of women themselves, positive or negative, that have the most impact, irrespective of objectionable content. In focusing so much on violence and sex, we may have been focusing on the wrong things,” Ferguson said.
“While it is commonly assumed that viewing sexually violent TV involving women causes men to think negatively of women, the results of this carefully designed study demonstrate that they do so only when women are portrayed as weak or submissive,” added Journal of Communication editor and University of Washington Professor Malcolm Parks. “Positive depictions of women challenge negative stereotypes even when the content includes sexuality and violence. In this way Ferguson reminds us that viewers often process popular media portrayals in more subtle ways than critics of all political stripes give them credit for.”
Proving once again what we all (or at least all of us of a certain age) already knew: Buffy is so much more awesome than Twilight ever could be.
[Author’s note: It’s possible that the entire point of this post was just an excuse to put up Jo Chen’s Buffy Illustration. I’m okay with that.]
Reblogged this on Adventures and Musings of a Hedgewitch.
It’s also a great excuse for me to post a link to this Buffy vs. Twilight picture http://slinkywhippetslandoflols.tumblr.com/post/25969099089/twilight-vs-buffy
They showed a convenience sample of 150 undergrads one of six TV episodes and had them fill out questionnaires afterwards. Since the students didn’t fill out questionnaires *beforehand*, the study didn’t actually measure any changes in depression, anxiety, or sexist attitudes. So this is a lousy basis for an inductive generalization, and an even worse basis for making causal claims. I’d suggest that relying on methodologically bad research to support our views is even worse than relying on no research whatsoever.
Also, whoever wrote the tables and charts has some weird ideas about how to present data: the order of the categories is inconsistent, there are no error bars, etc.
Dan, I’m not that familiar with research on this topic, but I think the study has to be taken in context. Previous work seemed to suggest that after watching depictions of violence, particularly sexual violence, viewers would report anxiety, unease, and in many cases somewhat sexist attitudes. This study suggests that you can get variation in these results by varying how women are portrayed – even if the level of violence remains roughly the same.
That seems fairly interesting. The methodology of the whole enterprise may well be questionable, sure. But this particular study’s methodology doesn’t seem aberrant when compared to similar ones.
Anyway, I mostly put this up because I wanted to talk about Buffy.
This is my favorite Buffy vs. Twilight piece: http://www.youtube.com/watch?v=RZwM3GvaTRM
It’s a really well done splicing of scenes from the two.
Hi Dan,
I’m not so down on the methodology.
Taking your points in turn:
1) Convenience sample…well, this is typical, yes? I wouldn’t want to use this as a strong basis for generalizing to, say, older people (due to possible generation effects). But it hardly seems meaningless.
2) Only questionnaires afterwards, so no baseline. But this is a between-subjects, not a within-subjects, study. Given that the students were randomly assigned a viewing, there’s no particular reason to think that any significant differences in other qualities (e.g., depression levels) were due to preexisting differences. That’s the point. Note the testing for differences in depression due to gender (again, between subjects).
Thus, for example,
So, there was no difference in anxiety levels between men and women as a whole, but women who watched the sexually violent + crappy female characters show exhibited more anxiety than the women who watched the other show.
There could be a confounding factor unnoted, or it could just have been chance, but the study is, in fact, designed to detect the effects of watching the show, and it’s a reasonable design.
(Within subject studies have their own issues and benefits, of course.)
So, this study is perfectly fine for making inductive generalizations (though I’d be cautious about generalizing to a markedly different population) and is perfectly ok for supporting causal claims. One nice feature is that they didn’t validate all their hypotheses, e.g., the effect on depression.
Yes, we shouldn’t use methodologically suspect or crappy work, but we should be clear on what constitutes methodologically problematic work.
Plus, the author was entirely judicious in his interpretation of the results:
So, a solid piece of work, but just one study.
Stacey @5: thanks, that is such a great remix!
I meant to include a link to the actual study earlier, but forgot to do so. Sorry about that.
magicalersatz – I agree that it’s an interesting topic, and that the methodology of this particular study is similar to the methodologies of other studies on the topic. My criticisms — especially of small samples and of convenience samples — also apply to these other studies. For problems with convenience samples made up of undergrads, see Henrich, Heine, and Norenzayan, “The weirdest people in the world?,” Behavioral and Brain Sciences, 33 no. 2-3 (June 2010), pp 61-83. Another, related general problem in many psychological studies includes “flexible” data collection and analysis; see Simmons, Nelson, and Simonsohn, “False-positive psychology”, Psychological Science, 22 no. 11 (Nov 2011), pp 1359-66.
Bijan Parsia –
On convenience sampling: For instance, the researcher notes that almost all of his subjects are Hispanic (he’s at Texas A&M International), but he doesn’t do anything at all to control for racial effects. For all we know, all of the non-Hispanic students ended up watching the same episode together, and a considerable chunk of the variation would be explained by race-linked differences in sexist attitudes. Notably, he feels free to speculate about race-linked differences in the conclusion: “Particularly among Latino men, for whom machismo often remains an influential cultural phenomenon, the depiction of strong females may threaten traditional gender roles.”
On the lack of a baseline: First, random subsamples will only tend to be representative of the population when you’re working with sufficiently large numbers. He had about 25 people per subsample. (Cf. figure 2 in the paper by Simmons, Nelson, and Simonsohn.) You’re right that there’s no particular reason to think there were any preexisting significant differences between the subsamples. Without establishing the baseline, there’s also no particular reason to think there weren’t any preexisting significant differences. You’re simply wrong when you say that “the study is, in fact, designed to detect the effects of watching the show.” For every factor other than gender, it relies on randomization over a small sample taken from an unrepresentative convenience sampling frame.
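To put a rough number on how much baseline imbalance pure chance can produce at this group size, here is a small illustrative simulation. It is not based on the study’s actual data: the trait distribution (a standard normal, standing in for something like baseline anxiety) and the group sizes are assumptions for the sketch.

```python
import random
import statistics

def max_baseline_gap(n_people=150, n_groups=6, trials=500, seed=1):
    """Repeatedly partition people carrying a preexisting trait (drawn from a
    standard normal) into equal groups at random, and return the largest gap
    between group means observed across all the trials."""
    rng = random.Random(seed)
    size = n_people // n_groups
    worst = 0.0
    for _ in range(trials):
        trait = [rng.gauss(0, 1) for _ in range(n_people)]
        rng.shuffle(trait)  # random assignment to groups
        means = [statistics.mean(trait[i * size:(i + 1) * size])
                 for i in range(n_groups)]
        worst = max(worst, max(means) - min(means))
    return worst
```

With six groups of 25, gaps of roughly half a standard deviation between the luckiest and unluckiest group turn up without any help from the treatment; with much larger groups the gaps shrink toward zero. That is the sense in which a pre-test would have added reassurance here.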
Note that, in the conclusion, he suggests future research should use a “pre/post design.” If this is a good idea for future research, why wasn’t it a good idea for this research?
On interpreting the results: I’ll grant that the author included some (in my view, basically pro forma) gestures in the direction of replicability and looking for confounders. First, he himself identifies one potentially major confounder, but does nothing to control for it here. Second, scientists rarely bother with straight-up replications, because funders don’t like to fund them and journals don’t like to publish them. Third, all of the media coverage of this piece uses the same, highly problematic language found in the Science Codex piece linked in the OP: “strong female portrayals eliminate negative effects of violent media.” I don’t want to get into an argument over whether the researcher bears any responsibility for crappy science journalism, but the crappy science journalism is certainly also a problem here, and I do have a problem with academics promulgating such crappy science journalism.
Hi Dan,
First, can I ask if you’ve designed and executed an experiment on humans? I see you do philosophy of science, but I’m curious as to your practice. (Obviously, if you haven’t, that’s nothing against your argument, but it would help me if you clarified this point, if only to keep me from pointless speculation :)) To be upfront, your comments seem to me to be similar to reviewer comments that strike me as uninformed…picking on convenience sampling, and complaining about sample size without going into which aspects of, e.g., the statistical tests are affected by it, are things I treat, rightly or wrongly, as tells. (For the record, I have. I recently did one using MSc students. My sample was smaller. I think it was fine and useful :))
This isn’t about convenience sampling, is it? The assignment is random, not convenience. That is, if we set the population to be the 150 students, he doesn’t convenience sample within that; he randomly assigns. Which, I think, is how it should be!
Your “for all we know” isn’t really very telling, is it? For all we know he made up the whole thing. The author was aware enough to speculate on racial issues, so it seems unlikely that they would have failed to notice such an unusual configuration. A quick glance at the demographics shows that “90.5% of students are minorities, including Hispanic, African American and Asian students.” So it’s really unlikely that there’d be an all-white section (90% of 150 = 135, so it’s impossible if the sample reflects the school demographics). I’m sure you can work out the probabilities yourself of drawing 1/6 of the marbles all white even if the bag is half brown, half white. This isn’t a real concern. But hey! Send the author an email and ask.
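For the curious, the marble arithmetic can be made exact with a hypergeometric calculation. A minimal sketch, with assumed numbers (that roughly 10% of the 150 students, i.e. 15, are non-minority, and that each cell holds 25), not figures from the paper:

```python
from math import comb

N, K, n = 150, 15, 25  # total students; white students (assumed ~10%); cell size

def hypergeom_pmf(k):
    """Probability that a randomly drawn cell of n students contains exactly k
    of the K white students (sampling without replacement)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Probability a given cell is even majority white (13+ of 25); an all-white
# cell is outright impossible, since there are only K = 15 white students.
p_majority_white = sum(hypergeom_pmf(k) for k in range(13, K + 1))
```

Under these assumptions the majority-white probability is vanishingly small, which is the point: random assignment makes the “all the non-Hispanic students watched the same episode” scenario a non-worry.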
This is the sort of thing which makes me feel that your objections are somewhat rote.
(I agree that it would have been nice for the author to be specific about the racial composition instead of using qualitative terms like “primarily” and “majority”.)
I wish you would use a more neutral description. “Feels free to speculate” goes in with your earlier disparaging words about the methodology.
So, what’s wrong with this? They have observed a phenomenon and are offering possible explanations. The full quote:
And a bit later:
It all seems fine, responsible, and appropriate.
Since the author does not generalize to the population even of Hispanic college students, I think it’s fine. Again, I think it helps to think of the overall sample as a population under study, at least at first.
It doesn’t seem relevant. BTW, I skimmed through that paper and it seems that Ferguson’s study would score pretty well. If you look at the requirements for authors, 1 is met, 2 is met (25 per cell!), I’ve no reason to doubt 3 and 4, etc.
You are still confusing within- and between-subjects designs, I think. Also, that’s the point of random subsampling, and why over-manipulating it is a mistake.
I really want to be snarky here.
I’m having trouble understanding this, though, again, it sounds generic. Randomization over the whole initial sample generates the subsamples. The subsamples are sufficiently large that fairly standard tests can yield significance given the effect sizes. Not too surprising given the phenomena under test. Ferguson doesn’t make any unreasonable generalizations, so I’m not particularly bothered by the sampling frame. I still don’t know why you are, unless you are starting from a position that convenience samples, esp. of college students, are inherently illegitimate. I don’t know what to say to that. It’s clearly a reasonable activity although, as with all empirical methods, you have to be very careful about its limitations.
I can think of a ton of reasons off the top of my head: for example, pre/post would have involved administering, e.g., the anxiety assessment both before and after the viewing, which would have lengthened an already lengthy session (1 hr!) and potentially introduced a number of effects, including measurement consciousness, activating anxiety, etc. I think this is a reasonable first study design which provides reasonable preliminary evidence that strong women characters might have this protective effect. I.e., that it’s worth studying more. It’s definitely worth trying to control for this effect in similar studies.
Eh. The second-to-last paragraph is pro forma, though noting the fact that they were targeting clinically relevant stuff was a useful reminder. The prior paragraph is fine. What, there can be no pro forma stuff in a study? The inclusion of boilerplate is evidence for the poorness of the study?
BTW, which confounder? I didn’t see any. Do you mean that matching may have been messed up? Or do you mean race?
There’s also danger in over-replication…statistics will have its due. I don’t think an exact replication would be interesting anyway.
With respect to your concerns about the coverage, fair enough. But they have nothing to do with the quality of the study and the methodology employed. So they are sort of irrelevant, except maybe helping to explain your animosity toward the article (esp. your linking the scientist and the journalism). (But, I’ve been there!)
Actually, the thing I’d like to know is whether any of the participants had seen any of the shows before (either the series or the particular episodes).
I don’t have further time for a detailed back-and-forth, so I’ll just say this and bow out of the discussion:
You’re right that most of my criticisms are “generic,” if by that you mean “not particular to this study.” Bad methodology — including an over-reliance on convenient, unrepresentative samples, small samples, and oversimplified experimental designs that don’t control for obvious confounders — is a widespread problem in many fields of behavioral science.
Oh, that’s too bad. Perhaps later, off line?
I’m really sad that you didn’t take a minute to address your experience and, esp., the okness of a non-pre/post study.
I agree that all the generic criticisms are widespread, but that doesn’t mean that they apply in this case. Over-reliance on generic criticism (“But Hume showed induction is false”) is *also* a problem.
In this case, the convenience sample was what it was and was fine (modulo threats to external validity). The subsamples were representative and adequately sized for the effects (and — good sign — not all the hypotheses were validated!). All the obvious confounders are plausibly accounted for (race wasn’t a confounder, afaict).
It’s a study with limitations. But Ferguson looked at the world and saw something interesting. Worth publishing and considering and further work.
Sorry sorry, to follow up again. But my last sentence is exactly the point. Some evidence is better than no evidence a good deal of the time. (Too many studies, as I pointed out, can spoil evidence, but too few means lack of evidence.)
In this case, the results are interesting and potentially of great clinical significance. If having strong female characters can overcome the sexual violence aspects, then that’s great news. It seems MUCH easier to incorporate various kinds of female strength than to reduce the amount of sexualized violence in e.g., films and TV.
I like it when the scientific method is put to use to analyze cultural artifacts–especially when the findings are as interesting and significant as this. Thanks for sharing!
[…] already call this the Buffy effect, after the strong female lead of the series Buffy the Vampire Slayer. Buffy takes no […]
So, I wrote Ferguson, and he gave a (helpful, IMHO) reply (of course, the reply confirms what I thought :)). I asked him specifically 1) about race as a confounder and 2) about prior viewing. His reply:
Oy, I screwed up the blockquoting early on. Sigh.
[…] The Buffy Effect A study seems to show that strong women in TV shows, like, say, Buffy, may be more important than plot content (including sex, violence, and even sexual violence) when it comes to shaping viewer attitudes to women. Um, duh? Still, nice to see data making the case. […]
[…] have all the mainstream superheroines for kids gone and how can we bring them back? […]
[…] The Buffy Effect (feministphilosophers.wordpress.com) […]
And now look at how the Buffy Effect is being echoed with Hunger Games, Revolution, and numerous other shows and movies portraying female archers: http://missingmarble.com/2013/02/04/archery-and-the-buffy-effect/