Wednesday, September 29, 2004

My own polls study

I've been tinkering around with polls.

I came up with the following nifty graphic:

[Figure: Kerry and Bush polls over time]

[Figure: legend for the Kerry and Bush polls plot]

The area above the zero line indicates support for Bush, and the area below the line indicates support for Kerry. Points in gray indicate polls of registered voters, while black points indicate polls of "likely" voters.

Since I have a job unrelated to polls and a family (also unrelated to polls), I assumed the polls are comparable and so included every one in the analysis. Here are a few thoughts.

It looks like Bush is gaining support over time, although there is so much noise that I'm not sure how much confidence I'd put in any lead at this point.

Fox polls, which I comment on because Fox News seems to be a favorite whipping boy of the left, fairly consistently showed more support for Kerry than the average poll did. Gallup consistently showed more support for Bush, sometimes significantly more, as in the recent poll showing Bush with a double-digit lead.

The data source is PollingReport (updated 9/27). After copying the numbers (Bush/Kerry numbers only) into a spreadsheet, I loaded the data into the R statistics program. It took a little manipulation to turn the dates from character strings into date objects, and I subtracted Kerry's support from Bush's to create a margin of support (with positive numbers favoring Bush). I then created a scatterplot using a different letter for each poll (as indicated in the legend).

The curve through the data was created using lowess. (Lowess is like a weighted moving average: it weights "the data point you're on" the highest and those farther away progressively lower. It's similar in spirit to the linear regression you do in basic statistics, except that the line is fit locally around each point, so the result can curve.) The smoothing parameter was 0.5.
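For anyone who wants to try something similar, here is a minimal sketch of those steps (parse the dates, compute the Bush-minus-Kerry margin, smooth with lowess). It is written in Python rather than R, and it is not the code behind the graphic. The poll rows are made-up placeholders, not PollingReport numbers. The names `to_day_number`, `margin`, and `lowess` are hypothetical helpers. The `lowess` here is a simplified version: a tricube-weighted local straight-line fit without the robustness iterations that R's `lowess` runs by default. It also assumes the 0.5 smoothing parameter is the fraction of points used in each local fit.

```python
from datetime import datetime

# Placeholder rows for illustration only: (poll letter, date, Bush %, Kerry %).
# These are NOT real poll results.
polls = [
    ("A", "9/1/2004", 47, 47),
    ("B", "9/5/2004", 48, 46),
    ("C", "9/9/2004", 46, 47),
    ("A", "9/14/2004", 49, 45),
    ("B", "9/20/2004", 48, 45),
    ("C", "9/26/2004", 50, 45),
]


def to_day_number(text):
    """Turn a 'month/day/year' string into a day count usable as an x value."""
    return datetime.strptime(text, "%m/%d/%Y").toordinal()


def margin(bush, kerry):
    """Margin of support; positive numbers favor Bush."""
    return bush - kerry


def lowess(xs, ys, f=0.5):
    """Simplified lowess: at each x, fit a weighted straight line to the
    nearest fraction f of the points (tricube weights) and evaluate it there.
    No robustness iterations are performed."""
    n = len(xs)
    k = max(2, int(round(f * n)))  # number of neighbours in each local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


days = [to_day_number(date) for _, date, _, _ in polls]
margins = [margin(bush, kerry) for _, _, bush, kerry in polls]
curve = lowess(days, margins, f=0.5)

for (letter, date, _, _), m, c in zip(polls, margins, curve):
    print(f"{letter} {date}: margin {m:+d}, smoothed {c:+.2f}")
```

Plotting each poll's letter at (date, margin) and drawing `curve` through the points would give a picture along the lines of the one above. A smaller `f` follows the individual polls more closely, while a larger one gives a flatter curve.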