Tuesday, June 29, 2010

Trouble In Polling Land

There's been a very interesting development this week in the world of progressive polling, as Talking Points Memo reports:

Calling into question years worth of polls, Daily Kos founder Markos Moulitsas said today his site will sue pollster Research 2000 after a statistical analysis showed that R2K allegedly "fabricated or manipulated" poll data commissioned by Kos.

Two weeks ago, after Kos dropped R2K for inaccuracy, a group of three of what Kos calls "statistics wizards" began looking at some of the firm's data and found a number of "extreme anomalies" that they claim may be the result of some kind of randomizer.

Kos Alleges Top Pollster Provided Bogus Data, Will Sue

[links from original]

I would not have used the term "randomizer" to describe what those statisticians found. In some cases, it looks like the results were not random at all. Their first case involves a weekly favorability rating poll for President Obama:

A combination of random sampling error and systematic difference should make the M results differ a bit from the F results, and in almost every case they do differ. In one respect, however, the numbers for M and F do not differ: if one is even, so is the other, and likewise for odd. Given that the M and F results usually differ, knowing that say 43% of M were favorable (Fav) to Obama gives essentially no clue as to whether say 59% or say 60% of F would be. Thus knowing whether M Fav is even or odd tells us essentially nothing about whether F Fav would be even or odd.

Thus the even-odd property should match about half the time, just like the odds of getting both heads or both tails if you tossed a penny and nickel. If you were to toss the penny and the nickel 18 times (like the 18 entries in the first two columns of the table) you would expect them to show about the same number of heads, but would rightly be shocked if they each showed exactly the same random-looking pattern of heads and tails.

Research 2000: Problems in plain sight

I take slight issue with this, in that there are some types of questions where the numbers would tend to be either both even or both odd. Anything with only two possible answers, for instance, would be one. The two percentages would have to add up to 100, so most of the time they would be either two odd or two even numbers. There were three choices in this poll, with "undecided" being the third, so more random odd-even pairings would be expected. Leaving that aside, though, the coincidences were way more than what can be explained away easily:

Were the results in our little table a fluke? The R2K weekly polls report 778 M-F pairs. For their favorable ratings (Fav), the even-odd property matched 776 times. For unfavorable (Unf) there were 777 matches.

Research 2000: Problems in plain sight
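
For a sense of scale, here is a rough sketch in Python of the arithmetic behind that reaction. It assumes, as the quoted analysis does, that each M-F pair independently has a 1-in-2 chance of matching in even-odd parity. The function name is my own, and the independence assumption is exactly what my caveat above would weaken for some kinds of questions.

```python
from fractions import Fraction
from math import comb, log10

def tail_at_least(n, k, p=Fraction(1, 2)):
    """Exact P(X >= k) for X ~ Binomial(n, p).

    Assumption (not from the pollster's data): every pair is independent
    and matches parity with probability p.
    """
    q = 1 - p
    terms = (comb(n, j) * p**j * q**(n - j) for j in range(k, n + 1))
    return sum(terms, Fraction(0))

# The 18-row table: all 18 pairs match.
all_18 = tail_at_least(18, 18)
print(all_18)  # 1/262144

# The full series: 776 or more matches out of 778 (Fav),
# and 777 or more out of 778 (Unf).
fav = tail_at_least(778, 776)
unf = tail_at_least(778, 777)

# These are too small to read as decimals, so print base-10 logarithms.
print(log10(fav.numerator) - log10(fav.denominator))  # about -228.7
print(log10(unf.numerator) - log10(unf.denominator))
```

Exact fractions are used here instead of floating point only so that nothing gets lost to rounding along the way. If the 1-in-2 assumption is too low for a particular question, pass a different `p` to see how much the conclusion depends on it.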

One might ask why, if Research 2000 were actually making up the data, they didn't pick less suspicious numbers. For that question, I have no answers. These are among the sorts of anomalies, though, that mathematicians look for when they are trying to find fudged or made-up data.

The third case, comparing the randomness of Gallup's favorability poll with Research 2000's, strikes me as being even more compelling:

[L]et’s look at the same for the weekly changes in R2K's first 60 weeks. There are many changes of 1% or -1%, but very few of 0%. It's as if some coin seemed to want to change on each flip, rarely giving heads or tails twice in a row. That looks very peculiar, but with only 59 numbers it's not so extremely far outside the range of what could happen by accident, especially since any accidental change in week 2 shows up in both the change from week 1 and the change to week 3, complicating the statistics.
If we now look at all the top-line changes in favorability (or other first-answer of three answers) for the last 14 weeks, we see a similar pattern.

Research 2000: Problems in plain sight

One would expect that, in week to week polling, there would be times when there was no change in the results from one week to the next. In the case of Research 2000's poll, there were very few, compared to Gallup. That particular anomaly occurred over two different periods in the space of a year.
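
To see why a shortage of no-change weeks looks odd, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: opinion is held perfectly steady at 50%, each week is an independent sample of a hypothetical 2,400 people, and sampling error is approximated with a normal distribution. None of those numbers come from Research 2000 or Gallup.

```python
import random
from collections import Counter
from math import sqrt

def simulate_weekly_changes(weeks, n, true_share, seed=0):
    """Count week-to-week changes in a rounded poll percentage.

    Assumptions (hypothetical, for illustration only): the true share never
    moves, weekly samples are independent, and sampling error is roughly
    normal with the usual binomial standard deviation.
    """
    rng = random.Random(seed)
    sd = 100 * sqrt(true_share * (1 - true_share) / n)  # in percentage points
    readings = [round(rng.gauss(100 * true_share, sd)) for _ in range(weeks)]
    return Counter(b - a for a, b in zip(readings, readings[1:]))

changes = simulate_weekly_changes(weeks=100_000, n=2400, true_share=0.5)
for delta in (-2, -1, 0, 1, 2):
    print(delta, changes[delta])
```

Under these assumptions, a change of 0 turns up more often than a change of +1 or of -1. A real poll also has real movement in opinion mixed in, so this is only a baseline, but it shows why a series with many 1-point moves and almost no 0s stands out.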

I urge anyone with questions to check out the article. I'm not any kind of mathematician, let alone an expert in statistics, but it looks to me like there is reason to be skeptical of these poll results.

The reason I found out about this was that I happened to see the title of this article by Nate Silver at FiveThirtyEight:

About 15 minutes ago, I was sent a cease and desist demand by Howrey LLP, the lawfirm that Research 2000 has contracted with to defend it against Daily Kos, which is suing it for fraud based on evidence that its polling may have been fabricated.

The cease and desist letter, which is published below, attributes to FiveThirtyEight statements that were made by others. It alleges that "you have engaged in a campaign to discredit and damage R2K by posting negative comments regarding Mr. Ali, the Company, and its work products on the "Daily Kos" blog." It further threatens a lawsuit, unless I "immediately cease and desist all such activities, and retract all previous publicly transmitted statements."

Research 2000 Issues Cease & Desist Letter to FiveThirtyEight

Nate had previously revised his ranking of political pollsters, in which Research 2000 scored below average (more correctly, below Nate's default rating for a new pollster) on both rating scales.

When I searched my blog for articles related to Research 2000, this was the most notable:

After the Massachusetts special Senate election, Democracy For America commissioned a poll to find out what issues affected the outcome.
Note that overall, and among independents, the public option is favored more than 6 to 1. Even among Republicans, the margin is nearly 3 to 1. These are only the people who voted for Obama in 2008 and for Brown in 2010, but they represent the sort of folks who supported Obama, and in all likelihood had considered voting for other Democrats, as recently as 2008.

Health Care: The Lesson To Draw From The MA-Senate Race

I've mentioned that poll several times since. I'm not aware of any other similar polls, so, in contrast to issues like President Obama's favorability ratings, there is nothing to compare it to in order to validate it. It's difficult to trust the results of that poll, at least until the questions about Research 2000's methods are cleared up.

What we should probably take away from this, for now, is that whatever we think we know about public opinion due to Research 2000's polls must be validated against other polls. In the case where no similar polls have been done by other pollsters, as was true of the post-MA Senate special election poll, the results must be viewed skeptically.

UPDATE: Nate Silver had written another article earlier today saying that he felt there was something to the case made by the "three statistics wizards."

In his article today announcing his intention to sue Research 2000, Markos Moulitsas asserted that the firm had promised to release all the data from the polls it did for Daily Kos, but later refused.

Meanwhile, Research 2000 contends that the real problem is that Daily Kos hasn't been paying its bills.

UPDATE 2 (June 30): Nate Silver points out a problem with the data in the Research 2000 tracking poll for President in 2008:

A lot of pollsters would have been reluctant to do this because the sample sizes were quite small -- on average, about 360 persons for each daily sample -- and presumably would have revealed rather striking variation from day to day simply due to sampling error. The margin of error on a sample size of 360 is +/- 5.2 points, so it would be fairly normal for Barack Obama's numbers to careen (for example) from 54 points one day, to 48 points the next, to 52 the day afterward.

But in fact, this didn't happen. In fact, their daily samples showed barely any movement at all. In the 55 days of their tracking poll, Barack Obama's figure never increased by more than 2 points, nor declined by more than 2 points.

Nonrandomness in Research 2000's Presidential Tracking Polls
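
The margin-of-error figure in that quote is easy to check, and a quick simulation gives a feel for how much a daily sample of 360 ought to bounce around. This is a sketch in Python under assumptions of my own: a fresh, independent sample of 360 people every day, true support fixed at a hypothetical 51%, and results rounded to whole points. The function names are made up for the example.

```python
import random
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error, in percentage points, for a simple random sample."""
    return 100 * z * sqrt(p * (1 - p) / n)

print(round(margin_of_error(360), 1))  # 5.2

def max_daily_change(days=55, n=360, true_share=0.51, rng=None):
    """Largest day-to-day move in a simulated tracking poll.

    Assumptions (hypothetical): each day is a new independent sample of n
    people, and the true share never changes.
    """
    rng = rng or random.Random()
    daily = [
        round(100 * sum(rng.random() < true_share for _ in range(n)) / n)
        for _ in range(days)
    ]
    return max(abs(b - a) for a, b in zip(daily, daily[1:]))

rng = random.Random(2008)
runs = 100
calm = sum(max_daily_change(rng=rng) <= 2 for _ in range(runs))
print(calm / runs)  # share of simulated 55-day runs that never moved more than 2 points
```

If the samples were not independent from day to day, for instance if the same respondents were re-interviewed, the simulated swings would be smaller, so this only describes what ordinary random sampling would look like.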

It looks to me like either Research 2000 was asking the same people the same question each time, or there is a problem with the poll. In fact, that's almost redundant; if they were asking the same people that question every time they published a poll, that's some seriously bad methodology.
