I've finally got around to writing a few more thoughts on what's turned into a healthy and productive conversation on pre-testing. You can read the original post here and the response from Millward Brown’s Nigel Hollis here. Be sure to read the comments too.
I should offer a bit of context first. Just as Nigel’s point of reference is Link, mine largely is not. My pre-testing experiences of the past few years have been with other suppliers, some of the major players and some smaller ones. I don’t have recent experience with Link: shifting client-supplier relationships mean I haven’t pre-tested with MB in 3½ years. So we may be talking about different things. Nigel describes ways that Link differs from other methodologies, and some of their recent developments. I can’t comment on those, but some of them do sound like positive steps. I should also say that I’ve liked what I’ve seen (at a distance) from MB in recent years – things like some of the changes to Link, their attempt to understand what drives PVR users to either skip or stop and watch advertising, and indeed Nigel’s blog itself. But before this starts sounding like a love letter to Millward Brown, let’s get down to business: some responses to Nigel, and some further thoughts on pre-testing.
One of the things I’d challenged pre-testing on was a tendency to focus on the rational. As Nigel mentions, many pre-test methods have recently shifted to focus more on involvement, emotional relevance and brand engagement. This is certainly true, and a welcome development. The ARF has recently published a document called “Measures of Engagement Volume II” which gives a useful overview of many of the new developments in research methodologies (Link is one of those featured).
Even if we're asking better questions, there remains some debate about whether people really are conscious of all of their emotional responses. On this issue, Nigel observes (quoting Antonio Damasio) that while our decisions are emotionally based, “unless those emotions are experienced consciously, they will have little effect.” So, he continues, “people must be conscious of their reaction if an ad is to have a lasting effect on them.” As might be expected on such an interesting topic, there are differing opinions. Robert Heath, through his work on Low Involvement Processing, has come to some different conclusions. In that same ARF document, “Measures of Engagement Volume II,” a paper of Heath’s serves as the introduction. In it he also extensively references Damasio, but he argues that conscious attention and emotional engagement are two separate constructs, and that there is “NO direct relationship between levels of attention and levels of engagement” (the emphasis is his). The paper is definitely worth reading.
However, regardless of the methodology or the types of questions asked, I still think there’s a fundamental difficulty in testing certain types of ads. Much of the problem comes down to stimulus – what do you actually test? I only mentioned stimulus briefly in the original post, but it deserves more discussion.
Let’s separate ads into two very broad camps. The first type of ad has clear rational messages and straightforward action and dialogue, and is set somewhere familiar like a kitchen. These ads are really about messaging. The other type of ad is not about messaging but about experience – it relies on specific casting or special effects or production values, and aims less to inform or persuade than to create an emotional response or to make an impact on culture.
Pre-testing of new ideas is (by its nature) done in pre-finished form. In the first kind of “message” advertising, it is easy to separate the idea from its execution. Production details like direction and casting are generally secondary. They serve to make a good idea better – more memorable, more interesting – but at a fundamental level, how the ad works remains the same whether it is produced to a high standard or a low one, or indeed described in a storyboard or animatic. This makes these ads easier to test in pre-finished form. And in these cases, I think pre-testing can be quite useful to ensure clarity of communication, relevance of the situation, and so on.
In the second type of ad, we cannot separate the idea from the execution: the execution is the idea. These ads have less overt messaging, instead attempting to create a feeling towards the brand. Sometimes this is done by telling emotional stories, as Dove and Axe do. Sometimes it is about creating a mood, sort of like a music video, the way Nike and iPod ads often do. And sometimes ads attempt to create something visually remarkable and never seen before – such as the Bravia and Honda work. In the era of YouTube and social networks, all of these try to encourage discussion and passing along, to create an “ohmygod have you seen that yet?” effect. And all rely on the specifics of casting and direction to work. I’ll reuse the quote from Unilever’s Simon Clift from the previous post – “With Dove, for example, it's how the girl comes across – her non-verbal gestures, the cut of her hair and whether she's sympathetic, that determine whether the message is believable. It's not the words put in her mouth…With Axe/Lynx particularly, there's no way you can tell whether this or that babe is going to be appealing to a 16-year-old boy from a line drawing.” We might be able to describe these ideas in simple terms, but their real power depends on how they are produced. At a fundamental level these ads won’t work the same way if they’re produced to a lower standard or described in storyboard or animatic form. Because of this, it can be meaningless to test these types of ads in pre-finished form, for the same reasons it would be meaningless to pre-test a description of an event or a video game or anything else that relies on an actual experience.
In these situations, one could still test a finished ad. For new ads this usually doesn’t make sense, because once an ad is produced you’re not really able to make significant changes. It does, however, make perfect sense at other times – say, when you’re considering running a historical ad, or an ad from another region, and want to see how it would work with your audience.
So to summarize, I believe that pre-testing is biased towards certain types of ads simply because they are easier to test in pre-finished form. This straightforward messaging style of advertising can be very effective. But it’s worth noting that the most successful advertising of the past few years falls into the second camp, and that most of those campaigns were not pre-tested.
I’d also like to mention a few other issues raised by Nigel.
Nigel defends the predictive ability of Link by citing strong correlations with in-market sales tracking. It’s impossible to comment without having seen the actual data, but I remain doubtful for several reasons. One is that my experience with market-mix modeling and case-study writing has left me skeptical that the effects of advertising can really be teased out from other market activity (I realize, of course, that lots of people spend lots of money trying to do exactly that). Markets are highly complex systems, and so many things influence sales that correlations can be misleading. Nigel does mention that some of his data do not “correct for the influence of other marketing variables,” and it’s worth considering what those variables might be. Ads which pre-test well may build confidence in the marketing team that created them, translating into higher media spend, longer rotations on air, more excitement from the sales force, and so on – turning the prediction from the pre-test into a self-fulfilling prophecy (not necessarily a bad thing, of course). And if Nigel is relying on his clients for data, then the sample might not be truly representative. I’ve too often seen marketers use their data selectively, discounting periods with poor sales results by blaming them on competitive activity, issues with distribution or sales teams, and so on.
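To make that self-fulfilling-prophecy point concrete, here’s a minimal, purely hypothetical sketch – invented numbers, no relation to Millward Brown’s or anyone else’s actual data – of how a pre-test score can end up strongly correlated with sales lift simply because good scores attract more media spend:

```python
# Toy simulation of the self-fulfilling-prophecy effect described above.
# All parameters are invented for illustration only.
import numpy as np

rng = np.random.default_rng(42)
n = 500                                                      # hypothetical campaigns

ad_quality = rng.normal(size=n)                              # latent creative quality
pretest_score = ad_quality + rng.normal(scale=0.5, size=n)   # noisy pre-test read of it

# Marketing teams put more weight behind ads that tested well.
media_spend = 1.0 + 0.8 * pretest_score + rng.normal(scale=0.3, size=n)

# In this toy world, sales respond only to spend -- the ad adds nothing beyond it.
sales_lift = 0.6 * media_spend + rng.normal(scale=0.3, size=n)

r = np.corrcoef(pretest_score, sales_lift)[0, 1]
print(f"correlation(pre-test score, sales lift) = {r:.2f}")  # strong, despite no direct ad effect
```

In this toy setup the pre-test “predicts” sales, but only because it shapes the investment behind the ad – exactly the kind of confound that makes raw correlations hard to interpret.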
He also makes a claim that is commonly heard – it’s echoed by some of the commenters – that making a decision without the benefit of pre-testing is “going on gut feel.” This makes it sound as if people are making important brand decisions on a whim, which is unfair. Nigel says “I’d prefer to base decisions on informed judgment than gut feel” [the emphasis is his]; but surely a decision made without pre-testing can still be well-informed. It’s a decision made by a team from the client and agency exercising the judgment for which they are supposedly paid, against specific brand objectives, after healthy debate, and ideally based on a significant amount of research early in the process to understand the audience, how they interact with media, and the role of the brand in their lives. To my mind, calling that “gut feel” seriously under-represents the work and expertise involved. The sad thing is that many marketers – in both client and agency organizations – have lost faith in their ability to make exactly these kinds of judgments. In my experience, this type of research is more often used as a substitute for judgment, and as a way to minimize the potential for blame rather than as a way to maximize success.
Nigel also helpfully picks up on this point about the mindset of those involved, and I wholeheartedly agree on the need to be open-minded and flexible. He links to an earlier post with the great title “Is the Link pre-test the equivalent of the Smith & Wesson 500?” which, drawing parallels to the gun industry, questions whether research companies do enough to ensure clients use their products safely and correctly. It’s a fantastic question, and Nigel has a thoughtful response that’s worth reading. But I want to mention here that it’s not just clients and research companies that need to examine their motives. We in agencies often do ourselves a disservice when we use research results selectively, championing the research (and ignoring any problems) when it supports the agency’s work, while criticizing the same methodology (and jumping on any problems) when an ad performs poorly. I know I’ve been guilty of this.
I hope I don’t come across as hating pre-testing. I think it can have lots of value. In an ideal world, I’d love to have a full understanding of each communication’s power before it was launched. It’s just that, the way testing is done today, I’m not sure that many of the types of ads we’re all increasingly trying to make can be pre-tested in a meaningful form.
I’ll say a huge thanks to Nigel and to all of the people who’ve commented on and linked to both this post and his. I don’t think there’s enough dialogue among researchers, agencies and clients about the research we do, so this has been refreshing for me. Thankfully, in the end, I actually think we agree on as much as we disagree. And I hope we can continue to talk about this. So I’ll open it to all of you again: does anybody agree, disagree, or have anything to add?
I think Lord Saatchi has the perfect answer: http://www.ft.com/cms/s/9c40e788-0e08-11dc-8219-000b5df10621.html
Posted by: Hans Suter | May 30, 2007 at 04:18 AM
Jason, while following your spicy debate with Nigel I was thinking that, instead of measuring IPA winners vs. Link, I would challenge MB to find a way to show whether the 2005-2006 Cannes/Clio/Epica film winners scored better in Link than the Cannes finalists did.
In my opinion, the main problem with Link and other pre-tests is that they go with the flow, not against it, because consumers think mainly in communication clichés (they’ll tell you what they’re used to seeing on TV). Put more simply, Link and its brothers push marketing folks towards “safe calls,” and this habit is giving headaches to planners and creatives.
I remain skeptical of their pre-testing approach. ;-)
Posted by: Stefan Stroe | May 30, 2007 at 08:23 PM
Hey Jason,
Read your blog, or most of it, and sent it on to a friend. He asked what Link was, which prompted this reply. I'm being lazy and not modifying it for you, but perhaps you won't mind.
Pete Gardiner
Link is/was a Millward Brown testing methodology that got its name from the premise that its results correlated, or linked, to in-market results. If an animatic got a good Link score, then the finished commercial would do what it was supposed to do in-market. Test subjects would sit at a computer in a room with about 20 other respondents and watch a series of 5 commercials, some finished and some in animatic form. They would answer a series of general warm-up questions (by typing in answers on a keyboard) about what they had just watched. Then they would watch only the animatic being tested, and would answer questions intended to gauge clarity of message, brand link, intent to purchase, and whatever else had been agreed upon for testing. Interestingly, Link also had a feature where people could operate a joystick as they watched the animatic. They would move the joystick to the right when they liked what they saw, and to the left when they didn't like what they saw. The middle was neutral. What you got was a second-by-second chart showing each viewer's feelings about the animatic.
I liked the comparison of Link to a Smith and Wesson: a powerful tool, yet dangerous in the wrong hands. My view is that Link overpromises. I remember what one of my profs in university told me about the Salem witch trials. He said the amazing thing was how rational and orderly the trial was. The rules and process of English common law worked smoothly. The only weird part was that the trial proceeded from the premise that there were witches. Link proceeds from the premise that a person responding to an animatic in a research room at a computer, prompted by a screen of questions, will accurately predict how a finished ad will do in-market. What about the quality of the finished ad? What about media weight? Too much? Too little? What about poor distribution or bad shelf placement or high price? The good folks at Millward Brown earnestly try to quantify it all into a graph showing second-by-second interest. They add charts and graphs and mean scores and their own language, and the whole thing becomes quite impenetrable and difficult to challenge. And when ad agencies do challenge, we're accused of being defensive.
When one of my spots was being tested, I asked to take part in a Link test myself. I was surprised that my joystick was actually quite difficult to move either left or right. So if someone is physically strong, they really like it, and if they have arthritis, they're quite neutral to the whole thing. I don't believe my complaint made the slightest dent in their statistical armour.
I think it's important for all of us to brutally cross-examine the witness of Testing as often as possible – really Eddie Greenspan them. They get too free a ride from us all. Or we could do what Rethink does, and refuse to have any part of it. There are no witches.
Pete
Posted by: Pete Gardiner | June 01, 2007 at 10:25 AM
Hi Jason,
I could not resist joining in the dialogue on the two questions your client posed. Since, based on our own research, it is vital to connect with people emotionally – i.e. engage them – I feel that copy testing has to move away from a focus on advertising messaging (clarity, credibility ...) and towards a focus on the customer and how she/he connects with and receives the brand's unique DNA.
Time and budget permitting, you could consider using MRI, EEG or other biologically based measurement techniques. One client I know simply likes to observe people's facial expressions. As far as surveys are concerned, even OTX, for whom Robert Heath consults, uses 1-to-5 ratings of feelings. Gerald Zaltman likes to use metaphor elicitation techniques. I like to use some of these solutions, along with impact-on-the-brand-type techniques.
People may not be sure about things they are unfamiliar with, but is there any evidence to suggest that they eventually come to like what they initially dislike?
To your point on "content" ads only: one of the best-liked ads we have ever reported on featured three frogs in a marsh burping "Bud," "Wise" and "Err" at a flickering neon sign. It had no messaging takeaway, but it had a powerful emotional connection with people.
I would say that Dove's Evolution viral spot would do well in a pre-test, as long as the finished spot was tested and not a stick-drawing version (which I could not imagine). Equally, I predict that Coca-Cola's Happiness Factory spot from "The Coke Side of Life" would do very well. Both just won Creativity awards, with Dove's Evolution getting the Grand Prize nod.
At the risk of sounding self-serving, I guess I feel that a one-size-fits-all testing technique could work for all advertising, as long as it is measuring Engagement quality and the communication of the unique DNA of the brand advertised, in the context of the advertising objectives set.
In answer to the second question – can copy testing predict? – if the copy-testing technique is not measuring the Engagement quality of the ad and the intended DNA for the brand, then I do not see how it can predict accurately.
David Ogilvy, initially a researcher, came up with the lamp-post phrase (research used for support rather than illumination), and really believed in the value of good research in helping to develop advertising that connects with people.
In our YouTube world, where consumers create and co-create advertising, wouldn't it be nice to go beyond stick drawings in portraying the creative intent?
Posted by: Mike Gadd | June 05, 2007 at 04:01 PM
Thanks for (re-)initiating a great discussion.
The one aspect of pre-testing that bugs me the most is whether or not we can really learn from the research how to make a spot better. Despite the "diagnostic" questions or the use of open-ended questions, without the ability to probe and question people about their understanding of the spot, are we really learning enough?
Ultimately, using this research (or any research) with a pass/fail approach is just a sign of poor business self-confidence. I find it puzzling that Nigel Hollis would not be an advocate of doing great strategic research upfront and relying on a client's experience to judge whether the ad delivers on that. At the very least, let's accept pre-testing as an imperfect indicator, not a fait accompli.
Posted by: mark | June 14, 2007 at 03:13 AM