Thursday, October 2, 2008

Cracking the Muhlenberg Code

Why don't the polling firms conducting daily tracking polls release the results of the individual days along with the rolling averages they produce? Ah, wouldn't it be nice. In reality, they don't, and that leaves people like me wondering how to work out what those numbers are. As the good folks at FiveThirtyEight showed, however, it is an inexact science with too many unknowns to pin those daily numbers down definitively. What they looked at was the national trackers. But what I'm more interested in are the state-level daily trackers. We only have one so far: the poll conducted by the Institute of Public Opinion at Muhlenberg College along with the Morning Call. So, what do we know and what don't we know?

[Figure: results from the first six days of the Muhlenberg tracking poll]

Above you'll see, from the Institute itself, the results from each of the first six days of the tracking poll in question. Now, to keep things simple, we'll focus just on margins and not each candidate's percentage. The latter just replaces one unknown with two, and that muddies the picture that much more. What we see, though, is that the margin increases by one point with each new release through the first five releases of the five-day rolling average. That tells us that for each of those days the new data coming into the average (an additional day's worth of data) has a margin that is about five points greater than the day that is being subtracted from the rolling average.

Huh? How do you know that? Well, each new release has five different days' worth of polling. For instance, the first release last Friday found that between September 21 and September 25 Obama averaged a four point lead over McCain. We know that the sum total of those five polls is somewhere around 20 (5 polls x 4 point margin). [I continue to say "around" because we are dealing with rounded numbers here. Again, that is another way in which this examination is oversimplified, but it is the information we have available to us.] If the average increases to five, then we know that that total increases to 25 (5 polls x 5 point margin). And as that one point increase pattern continues, you simply add five more points to the total until things either level off or drop.
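For anyone who wants to check that arithmetic, here is a minimal Python sketch. The function names are mine and purely illustrative. The first function treats the published margins as exact; the second assumes, as a simplification, that the published margin is the true average rounded to the nearest whole point, which is why the answer is only "about" five.

```python
WINDOW = 5  # days in each rolling average


def incoming_minus_outgoing(prev_avg, new_avg, window=WINDOW):
    """Gap between the newly added day's margin and the dropped day's margin.

    Treats the published averages as exact (an assumption; they are rounded).
    """
    return window * new_avg - window * prev_avg


def difference_bounds(prev_avg, new_avg, window=WINDOW, half_step=0.5):
    """Range of possible gaps if each average is rounded to the nearest point."""
    low = window * ((new_avg - half_step) - (prev_avg + half_step))
    high = window * ((new_avg + half_step) - (prev_avg - half_step))
    return low, high


# First two releases: Obama +4, then Obama +5.
print(incoming_minus_outgoing(4, 5))  # 5
print(difference_bounds(4, 5))        # (0.0, 10.0)
```

In other words, a one point rise in the published average could reflect anything from almost no difference between the incoming and outgoing days to a gap of nearly ten points.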

If we assume that each of those days in the first poll released last Friday had the exact same result, then we would see something like this:

Muhlenberg Tracking Polls - PA (Oct. 1):
Scenario #1

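| Polls / Date (Sept.) | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
|---|---|---|---|---|---|---|---|---|---|---|
| #1 | 4 | 4 | 4 | 4 | 4 |   |   |   |   |   |
| #2 |   | 4 | 4 | 4 | 4 | 9 |   |   |   |   |
| #3 |   |   | 4 | 4 | 4 | 9 | 9 |   |   |   |
| #4 |   |   |   | 4 | 4 | 9 | 9 | 9 |   |   |
| #5 |   |   |   |   | 4 | 9 | 9 | 9 | 9 |   |
| #6 |   |   |   |   |   | 9 | 9 | 9 | 9 | -1 |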

So if every day in the original poll found Obama to be up four, he would have to be up nine on the sixth day, when the first day's four point margin is phased out. We can continue that replacement and it looks just fine until you get to yesterday's poll, when the margin dropped by one. Not only does the above hypothetical lean on an oversimplifying assumption, it is also probably unrealistic. I think it is unlikely that there was a McCain blip in there, as the -1 would imply.
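The Scenario #1 logic can be written out as a short Python sketch. It is only as good as its assumptions: the first five daily margins are a guess, the release margins (4, 5, 6, 7, 8, 7) are treated as exact rather than rounded, and the function name `reconstruct_days` is just my own label.

```python
def reconstruct_days(first_window, averages):
    """Rebuild daily margins from rolling averages, given a guessed first window.

    first_window: assumed daily margins behind the first release (a guess, not data).
    averages: published rolling-average margins, one per release, treated as exact.
    """
    window = len(first_window)
    days = list(first_window)
    for prev_avg, new_avg in zip(averages, averages[1:]):
        dropped = days[-window]  # the day leaving the rolling average
        added = dropped + window * (new_avg - prev_avg)
        days.append(added)
    return days


published = [4, 5, 6, 7, 8, 7]  # margins from the first six releases
print(reconstruct_days([4, 4, 4, 4, 4], published))
# [4, 4, 4, 4, 4, 9, 9, 9, 9, -1]
```

Change the guess for the first five days and every later day changes with it, which is exactly why the daily numbers can't be pinned down from the averages alone.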

It is not unrealistic, then, to forcibly constrain things to the Obama side of the ledger. But what that tells us is that, if the original average was four points and if there was a drop in the margin yesterday, then there must have been an inflated number on the last day (September 25) of the initial poll. That would be the day that was phased out in the data that the Institute released yesterday. To frame this slightly differently, there was likely something of an established pattern to the data from September 21-24, but the data from the 25th was a significant departure from that pattern, one that altered things thereafter. Below, I think, is a good guess as to how things may have looked over the course of the first ten days of the tracking poll.

Muhlenberg Tracking Polls - PA (Oct. 1):
Scenario #2

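| Polls / Date (Sept.) | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
|---|---|---|---|---|---|---|---|---|---|---|
| #1 | 2 | 3 | 2 | 3 | 10 |   |   |   |   |   |
| #2 |   | 3 | 2 | 3 | 10 | 7 |   |   |   |   |
| #3 |   |   | 2 | 3 | 10 | 7 | 8 |   |   |   |
| #4 |   |   |   | 3 | 10 | 7 | 8 | 7 |   |   |
| #5 |   |   |   |   | 10 | 7 | 8 | 7 | 8 |   |
| #6 |   |   |   |   |   | 7 | 8 | 7 | 8 | 5 |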

We know that since there was a one point drop in the margin yesterday, there was around a five point drop in the total of all five days' polls. When we additionally factor in that the drop still left Obama with a pretty good lead, you get a pattern similar to what is depicted above. Things were close and then, suddenly, they weren't.
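As a sanity check on Scenario #2, this small Python sketch recomputes the five-day rolling averages from my guessed daily margins. The daily numbers are guesses, not Muhlenberg's data, and `rolling_averages` is simply an illustrative helper.

```python
def rolling_averages(days, window=5):
    """Average margin for each consecutive `window`-day stretch."""
    return [
        sum(days[i:i + window]) / window
        for i in range(len(days) - window + 1)
    ]


# Guessed daily margins for Sept. 21-30 (Scenario #2), not actual data.
scenario_2 = [2, 3, 2, 3, 10, 7, 8, 7, 8, 5]
print(rolling_averages(scenario_2))
# [4.0, 5.0, 6.0, 7.0, 8.0, 7.0]
```

The guess reproduces the released margins, but so would plenty of other combinations, so this confirms that the scenario is consistent with the releases and nothing more.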

Is this right? I don't know. As I said, this is just a guess. More importantly, how is FHQ going to deal with the data from Muhlenberg? We are, as Electoral-Vote.com is doing, going to take one poll every six days. That way no one day in this series of surveys is counted more than once. The first poll Muhlenberg released covered September 21-25. We would take that poll and then the one where the 25th is phased out of the average (the poll covering September 26-30). That decision was reflected in last night's update of the map.
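The selection rule amounts to keeping only releases whose date windows don't overlap. Here is a rough Python sketch of that idea; the function name and the way releases are represented as (start, end) date pairs are my own assumptions for illustration, not anything FHQ or Electoral-Vote.com publishes.

```python
from datetime import date, timedelta


def non_overlapping(releases):
    """Keep releases whose date windows share no day with the last one kept."""
    kept = []
    for start, end in releases:
        if not kept or start > kept[-1][1]:
            kept.append((start, end))
    return kept


# The first six releases, each covering five consecutive days.
first = date(2008, 9, 21)
releases = [
    (first + timedelta(days=i), first + timedelta(days=i + 4))
    for i in range(6)
]

for start, end in non_overlapping(releases):
    print(start.isoformat(), "to", end.isoformat())
# 2008-09-21 to 2008-09-25
# 2008-09-26 to 2008-09-30
```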



15 comments:

Unknown said...

The real killer in this kind of analysis is the rounding. A drop from 49 to 48 may actually be a drop from 48.6 to 48.4 or it might be from 49.4 to 47.6. In the second case there is roughly ten times as much variation to explain.

Anonymous said...

Exactly. It makes the guessing that much harder.

[I was going to say educated guessing, but that sounds a bit pompous, doesn't it?]

They (the pollsters) obviously have the daily numbers, why not just go public with that? And hey, why not throw us a decimal place or two? We don't ask for much.

This is an "if I won the lottery I'd..." scenario, but if it were me, that's what I'd do (said the guy who got barked at for not laying out his methodology clearly enough last night -- I'm kidding, Jack. I'd rather have the input than not.). In fact, I'm getting ready to put in an application at High Point University in North Carolina, for a professor/polling director position. Perhaps I'll have that chance. Ha!

Robert said...

Yes, rounding was a mess during the primaries. I finally had to get a calculator as the percentage differences were rounded to 2%. I needed a calculator.

Jack said...

Josh:

To borrow one of your lines, is there any place I can report abuse here?

In all seriousness, your methodology was laid out clearly - I just misread it.

Anonymous said...

Ha!

Jack,
It is relatively clear, but the main issue is that it isn't easy to find. I probably need to fix that. But hey, at least I'm not RCP today. Though, I will admit that that post made me think of our discussion here last night.

Jack said...

Yeah, Nate really laced into RCP, even if he's retracted a bit of it.

Is there anyplace where you posted exactly how you weigh the polls? I don't mind looking through the archives if there is one.

I see McCain's pulling out of Michigan?

Anonymous said...

Jack,
There is, but I may not have updated it when I made the switch in how many recent polls I was weighting.

Essentially, the most recent poll counts as two-thirds of the average and all the remaining polls (back to Super Tuesday) account for the remaining third.

It isn't that complicated really. That's why I got a kick out of someone calling it an algorithm on another site earlier this year. Now, four years from now, I may opt for a more complicated version, but this has worked pretty well this time around.

Anonymous said...

This Michigan thing is big and that's an understatement. This isn't like Obama pulling out of Georgia. This is a major potential piece to McCain's electoral college math being taken off the table. Was it a realistic piece? It doesn't look like it, but symbolically this is a big blow to their campaign.

As an aside, why is this stuff coming out right as I'm leaving the office every day? I'm staying late tomorrow.

Jack said...

You should send the campaigns and news media your schedule. I'm sure they'd be happy to work around it.

Anonymous said...

See, that's what I was thinking. Some of those Michigan McCain folks can come work for me.

Jack said...

And you're right, McCain pulling out of Michigan sounded a bit too good to be true, though I'm seeing it all over the place. One of the trolls at 538 said McCain's pulling out because he already has Michigan in the bag.

I guess the economy stuff would play well there, even if some are surprised it took this long.

Still, if McCain loses Michigan, Wisconsin and Minnesota, he has to win ... Colorado, Florida, Indiana, Ohio, Missouri, Virginia and North Carolina and one of New Hampshire and Nevada. Which makes it way too important to concede.

Unknown said...

Yeah, Michigan is a big deal. It always had the chance of zigging when other states zagged: its economy is so heavily influenced by a single high-profile industry, and there are lingering complications from the Primary mess. One weird Obama gaffe related to the auto industry could have made it close again.

I think McCain is making a mistake. It and New Hampshire are the only blue states I can think of where he had a realistic shot at "stealing" them--i.e. winning them without a substantial national popular vote margin. McCain could win Pennsylvania under some scenarios, for instance, but I bet he couldn't do it without winning the national popular vote by at least 2 points.

Jack said...

Apparently I also assumed New Mexico and Pennsylvania were going blue in my last post. While that seems reasonable, I did mention Wisconsin and Minnesota as a condition and should have included NM and PA in that caveat too.

Jack said...

One last comment before I'm off for a few hours: Gotta love the McCain campaign. Spends two days in Iowa and gives up on Michigan.

Maybe Republicans are trying to win over disaffected Hillary supporters by running their campaign as badly as hers was. You know, showing solidarity, that kind of thing.

Unknown said...

And then there's this. It does validate Maine's decision to allocate its EVs by congressional district. Maybe the extra attention and money they get will encourage a few other states to follow suit in 2012.