Frontloading HQ: biographical data


Wednesday, November 18, 2009

Obama v. Palin in 2012? One Forecast is Already In

A month and a half ago, FHQ linked to and discussed a presidential election forecasting model built on candidate biographical information. The benefit of this model -- and it performs quite well stacked up against other forecasting models -- is that the biographical data exists now. In other words, you don't have to wait until the second quarter economic numbers are released, or for polling data from a particular period of the election year, to put an accurate forecast together. [But hey, if you want to continue to come here and watch FHQ wade through the quadrennial polling data on the presidential race, we won't fault you. We here at FHQ may go so far as to encourage it.] I left off in that post urging folks to start scouring the biographical data on the prospective 2012 Republicans.

But why do that? Well, if you're patient, you'll be pleasantly surprised by an email from the authors of the original research. And lo and behold, one of those co-authors, Andreas Graefe (the other is J. Scott Armstrong), emailed me this morning to inform me that -- yes, that's right -- they've already looked at the Obama v. Palin numbers. How does Palin fare against the President?

[Figure: Obama v. Palin biographical index comparison. The full description of the 2012 update is available at PollyVote.]

That nine-point difference between the two candidates' biographical index scores translates to Obama carrying a 59.6% share of the two-party vote in 2012 if this were the matchup. (For some context, Obama received 52.9% of the vote in 2008, or 53.7% of the two-party vote.) That's Reagan-Mondale territory and would likely make for quite the electoral college sweep for Obama.

But didn't you say that this model wasn't particularly adept at picking elections involving incumbents? (Ah, you followed the link and read the previous post, didn't you? Thanks.) That's right. The model has missed three times, each in a race with an incumbent on the ballot, and each time the candidate with the lower biographical score won (Truman '48, Carter '76, Clinton '92). It has been done, then, but let's look a little more closely at those three elections. Carter and Truman had deficits of five points on the biographical index while Clinton trailed Bush by just three. Palin's nine-point disadvantage against Obama, though, is more than twice the average deficit (about 4.3 points) across those three incorrectly predicted elections.
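
For anyone who wants to check that comparison, here is a quick back-of-the-envelope sketch in Python. The deficits are the ones quoted above; the variable names are mine and nothing here comes from the authors' own code.

```python
# Biographical index deficits of the eventual winners in the three
# elections the model called incorrectly (figures as quoted in this post).
missed_deficits = {"Truman 1948": 5, "Carter 1976": 5, "Clinton 1992": 3}

# Palin's deficit against Obama in the 2012 update.
palin_deficit = 9

# Average deficit overcome in the three misses, and how Palin's compares.
average_deficit = sum(missed_deficits.values()) / len(missed_deficits)
ratio = palin_deficit / average_deficit

print(f"Average deficit overcome: {average_deficit:.2f} points")
print(f"Palin's deficit relative to that average: {ratio:.2f}x")
```

In other words, the gap is a bit more than double the average hole that Truman, Carter and Clinton climbed out of.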

That's a real hole to be in even before you start considering running for president. But back to my question from the last post: Who among the 2012ers does the best?

A special thanks to Andreas Graefe for drawing our attention to the updated 2012 outlook.


Wednesday, October 7, 2009

Predicting Presidential Elections from Biographical Information

Why crunch a bunch of numbers via regression to forecast a presidential election when the candidates' biographical data seemingly gets you closer to the actual results? I don't know. This won't put number crunchers out of business (Good, I didn't waste 2008 after all!), but the findings from a study by Armstrong and Graefe do shed light on an interesting new avenue by which election outcomes can be predicted. Here's how they constructed their model:

"We created a list of 49 cues from biographical information about candidates that were expected to have an influence on the election outcome. Then, we estimated whether a cue has a positive or negative influence on the election outcome. ... We distinguished two types of cues: (1) Yes / no cues record whether a candidate shows a certain characteristic or not. (2) More / less cues are more complex as they also incorporate information about the relative value of the cue for the candidates that run against each other in a particular election. In general, the candidate who achieved a more favorable value on a cue was assigned a value of 1 and 0 otherwise. For more information on the coding see Appendix 1. Finally, the sum of cue values for each candidate in a particular election determined his PollyBio index score (PB)."

And what did that yield? Out of the 28 elections between 1900 and 2008, the candidate with the higher PB index score won 25 times (see below).
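
To make the scoring procedure in the quoted passage a little more concrete, here is a minimal Python sketch of how such an index could be tallied. To be clear, this is my own illustration and not the authors' code: the three cues, their assumed directions, the tie rule and the candidate data are all hypothetical stand-ins (the real model uses 49 cues, coded as described in the paper's Appendix 1).

```python
from dataclasses import dataclass, field

# Hypothetical cues, not the paper's list. +1 means the cue is assumed
# to help a candidate, -1 means it is assumed to hurt.
YES_NO_CUES = {"college_degree": +1, "divorced": -1}
MORE_LESS_CUES = {"height_cm": +1}


@dataclass
class Candidate:
    name: str
    yes_no: dict = field(default_factory=dict)     # cue -> True / False
    more_less: dict = field(default_factory=dict)  # cue -> numeric value


def yes_no_points(candidate):
    """One point for each yes/no cue where the candidate has the favorable value."""
    points = 0
    for cue, direction in YES_NO_CUES.items():
        has_trait = candidate.yes_no.get(cue, False)
        # Assumption: lacking a harmful trait counts as favorable.
        favorable = has_trait if direction > 0 else not has_trait
        points += int(favorable)
    return points


def more_less_points(candidate, opponent):
    """One point for each relative cue where the candidate beats the opponent."""
    points = 0
    for cue, direction in MORE_LESS_CUES.items():
        diff = (candidate.more_less[cue] - opponent.more_less[cue]) * direction
        points += int(diff > 0)  # assumption: a tie scores 0 for both
    return points


def pb_index(candidate, opponent):
    """Sum of cue values for one candidate in one head-to-head election."""
    return yes_no_points(candidate) + more_less_points(candidate, opponent)


# Two made-up candidates.
a = Candidate("Candidate A", {"college_degree": True, "divorced": False}, {"height_cm": 185})
b = Candidate("Candidate B", {"college_degree": False, "divorced": False}, {"height_cm": 180})

score_a, score_b = pb_index(a, b), pb_index(b, a)
print(a.name, score_a, "|", b.name, score_b, "| difference", score_a - score_b)
```

The point of the sketch is simply that the index is a unit-weighted tally: every cue counts the same, and the forecast leans on the difference between the two candidates' totals.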

Source: Armstrong and Graefe (2009). "Predicting Elections from Biographical Information about Candidates"

My first thought was, "I'll bet they missed the close ones." After all, those are the elections most forecasting models have the hardest time predicting. But that wasn't necessarily the case here. The Armstrong and Graefe model missed 1948 (Truman), 1976 (Carter) and 1992 (Clinton), and on the first two it had company from other noted forecasting models. The only miss that really stands out is Clinton's election in 1992.

"PollyBio failed in predicting the correct winner for the three elections in 1948, 1976, and 1992, in each of which an incumbent president was running. A look at the data helps to explain the failure for these three elections. Gerald Ford in 1976 and George Bush in 1992, who were both wrongly predicted to win, had particularly strong biographies. For our set of ‘yes / no’ cues, which did not include relative measures between candidates (like height, intelligence, or attractiveness), Ford and Bush achieved the highest score of all 56 candidates in our sample (together with Theodore Roosevelt in 1904 and William McKinley in 1900). By comparison, Harry Truman, who PollyBio failed in predicting to win the 1948 election, scored particularly low on the same set of cues. Being the only U.S. president after 1897 who did not earn a college degree, Truman achieved the lowest score of all incumbents in the sample. Among all candidates, only three achieved a lower score."

What was the common theme? A switch in power from one party to the other? That the winners were all Democrats -- Southern Democrats at that? (Fine, Missouri's a border state.) No, those weren't it. All three elections involved incumbents. The model seems to do better in open seat races than in those where an incumbent is running.

So why wait for election day in 2012? Start comparing the bios of the prospective Republican candidates against Obama now. Who stacks up best? (My guess is Romney or Gingrich.) Hey, it is a race that involves an incumbent.

Hat tip to Political Wire for the link.
