Latest national poll median date: October 20
Projections reflect recent polling graciously made publicly available by pollsters and media organizations. I am not a pollster, and derive no income from this blog.

Wednesday, July 31, 2019

Criteria for Poll Inclusion

As some of you may have noticed, some pollsters are pushing back against their polling results being used in seat projections. To the extent that some media outlets may be free-riding on their work for financial gain, I understand their frustration. On the other hand, if these firms release their political polling results publicly, they gain publicity, which may help, for example, their market research business. So one could argue that polling firms can't have it both ways, especially given the fair dealing provision of Canadian copyright law.

This argument poses an obvious ethical dilemma for this blog. First, I would like to point out that I am not making a penny from making these projections: this blog is not monetized (as you can see, there are no ads), and because I am doing this anonymously, there is no prospect of a media outlet hiring me.

Nevertheless, as a producer of knowledge in my real job, I sympathize with pollsters' desire to protect their intellectual property. Therefore, going forward, my projections will only include polls satisfying both of these conditions:
1. the poll results have been publicly disclosed by the polling firm, a member of the polling firm, or a person/organization that directly obtained permission to disclose the results (such as a news outlet that commissioned the poll), and
2. at least one disclosure covered under #1 does not state that the use of the results for seat projections is prohibited.

For example, I will not be using:
- polling results that are behind a paywall and leaked (including regional results, even when the same poll's national results are public);
- results from a pollster that does not want their numbers fed into a projection model, if this is clearly indicated in ALL disclosures of poll results by the pollster or an authorized entity.

However, I will feel free to use:
- polling results that were put behind a paywall after being publicly disclosed;
- results that are disclosed (by the pollster or an authorized entity) without a mention that they should not be used in a projection model, even if other disclosures of that same poll carry such a mention (e.g. pollster says "don't use these," but their client news organization doesn't).

To show appreciation to pollsters that remain open to having their numbers used (which is still, fortunately, most of them), I will, for any poll used in the projection:
- refrain from giving the poll's top line numbers on this blog (except in rare cases where it is necessary for my commentary);
- link, as much as possible, either to the pollster's website or to the website of a news outlet having commissioned the poll (rather than directly to the PDF or to a third party source);
- avoid, to the best of my ability, providing commentary on that poll that overlaps a lot with the pollster's commentary on that poll. (I may still provide commentary on aspects the pollster is unlikely to comment on, such as how the poll compares with my polling average.)

I hope that these measures will encourage you, the reader, to visit the pollster's or the commissioning news outlet's website. If pollsters or their clients benefit from projection websites such as this one, they will be less inclined to restrict the use of their numbers or hide them behind a paywall.

Finally, from time to time, I will be removing from the right-hand column of this page pollsters that restrict, on their own website, the use of both their latest poll and all federal voting intention polls they conducted within the past month. Some polls from such firms may still be included in the projection if, for example, a news outlet commissioning them made the results available.

New Poll Weighting Methodology

Update Aug. 26: This post discusses national weights. Regional weights are discussed here.

One of the main methodological changes that I have made to the model concerns the weighting of polls. In the old formula, poll weights depended on sample size, recency and whether the same pollster has a more recent poll. Going forward, they will also depend on the presence of ALL other polls.

The old formula consisted of a number of ad hoc discount factors based on what seems sensible. It seemed to work OK most of the time, but when shifts in poll numbers occurred, one always wondered if the formula was too slow/quick to react.

The new weighting formula tries to approximate the optimal variance-minimizing (linear) formula under certain straightforward assumptions. That is, it is grounded in statistical theory rather than just being what seems reasonable to me. These assumptions will also allow me to propose approximate confidence intervals (to be explained in another post) not just for "what would happen if an election took place at the same time as the most recent poll," but also for Election Day.

Unfortunately, the new weights are derived through solving a system of (linear) equations (one equation per poll), so I can't tell you exactly when a poll would be discounted by 30% vs. 50%. Instead, I'll point out some notable effects of the changes:

- An old poll will now be discounted more aggressively if there are many other old polls. This makes sense as old polls' errors relative to current voting intentions are correlated through the change in voting intention since they were conducted. This can be very significant: the 6/27-7/2 Mainstreet poll of 2,651 respondents would have a weight of over 50% of the most recent poll (Forum 7/26-28, 1,733 respondents) if they were the only two polls. But due to all the intervening polls, that Mainstreet poll currently counts less than 1/8 as much as the Forum poll.

As a result of this change, I will not need to change the formula to discount more aggressively when polls become more frequent: the model will automatically take care of this, and do it right.

- In certain cases, a poll can have negative weight! This appears surprising at first (my reflex was that I made a programming mistake), but it's actually not that hard to understand. Suppose there are two pollsters, A and B. Pollster A has conducted one recent poll and one old poll. Pollster B has conducted only one old poll. In order to guard against pollster A's potential in-house bias, the model wants to put significant weight on pollster B's poll. But that might put too much weight on an old data point, so it may be optimal to assign a slightly negative weight to pollster A's old poll.

Here are the details. I assume that polls have four potential independent sources of error for estimating current support:
1. Sampling variance: this is the pure statistical error, which is what a poll's "margin of error" refers to.
2. Changes in public opinion since the field dates: I mostly assume that voting intentions follow a random walk (a slight adjustment is made for potential short-term momentum).
3. The specific pollster's in-house bias: pollsters vary in their methodology.
4. Bias common to all pollsters: pollsters' methodologies may have common flaws.

Of course, there's no way to reduce #4 through averaging polls, so the weighting formula ignores it. (But it is very important in getting the right confidence intervals.)

#2 implies that any two polls' errors have a common component: the evolution of public opinion since the more recent poll. The more a poll's error is correlated with other polls', the less informative it is, and the less it should be weighted. Therefore, all polls' weights should depend on each other (over and above other polls "changing the denominator"). How much this is the case depends on how fast one thinks public opinion evolves.

Voting intentions tend to be much more volatile during a campaign - especially late - than before a campaign. Therefore, I will compute each poll's normalized age a by weighting each calendar day elapsed since the poll's median field date (its "calendar age") according to when that day falls (updated with the tentative date of the first debate: October 7):
- Before September 1: 0.1
- September 1-20, before writs are issued: 0.2
- September 1-20, after writs are issued: 0.5
- September 21-30: 1
- October 1-7: 1.5
- October 8-21: 2
For example, a July 15 poll will be considered 3 days old on August 14. A September 15 poll will be considered 7.5 days old on September 25 (5*0.5 + 5*1), as writs must be issued by September 15. An October 10 poll will be considered 14 days old on October 17.
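For readers who want to experiment, the schedule above can be sketched in a few lines of Python. This is an illustration, not the model's actual code. The function names, the 2019 calendar year, the default writ date of September 15 (the latest date mentioned above), and the convention of counting every day after the median field date up to and including today are all assumptions made for the sketch. Day weights are stored in tenths so that the sums stay exact.

```python
from datetime import date, timedelta

def daily_rate_tenths(day, writ_date):
    """Weight of one calendar day, in tenths (integers keep the sums exact)."""
    if day < date(2019, 9, 1):
        return 1                               # 0.1 before September 1
    if day <= date(2019, 9, 20):
        # Assumption: "after writs are issued" means on or after writ_date.
        return 5 if day >= writ_date else 2    # 0.5 after writs, 0.2 before
    if day <= date(2019, 9, 30):
        return 10                              # 1 for September 21-30
    if day <= date(2019, 10, 7):
        return 15                              # 1.5 for October 1-7
    return 20                                  # 2 (the schedule is only stated through October 21)

def normalized_age(median_field_date, today, writ_date=date(2019, 9, 15)):
    """Sum the weights of every calendar day after the field date, up to today."""
    tenths = 0
    day = median_field_date
    while day < today:
        day += timedelta(days=1)
        tenths += daily_rate_tenths(day, writ_date)
    return tenths / 10

print(normalized_age(date(2019, 7, 15), date(2019, 8, 14)))    # 3.0
print(normalized_age(date(2019, 9, 15), date(2019, 9, 25)))    # 7.5
print(normalized_age(date(2019, 10, 10), date(2019, 10, 17)))  # 14.0
```

The three printed values reproduce the three examples given above.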

I assume that the variance of the evolution of public opinion over a normalized time period of length a is V(a) = 0.0001a. Assuming a normal distribution, this implies that, in a given week, for a given main party (i.e. 30%+ in the polls), there is approximately a 30% chance of a change in support exceeding:
- 0.8% before September 1
- 2.6% toward the end of September
- 3.7% in October, after the debate
I didn't do any formal analysis for this calibration, but to me, eyeballing how things evolved in the past two campaigns, this passes the smell test.
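The thresholds above can be checked with a short calculation. The sketch below assumes that each threshold is simply one standard deviation of a week's worth of change, sqrt(V(a)), where a is 7 calendar days times the day weight for that period. For a normal distribution, the chance of landing more than one standard deviation from the mean (in either direction) is about 32%, which is the "approximately 30%" quoted above. The function names are mine.

```python
import math

DRIFT_VARIANCE_PER_DAY = 0.0001  # V(a) = 0.0001 * a, as stated in the text

def weekly_std_dev(day_weight):
    """Standard deviation of the change in support over one calendar week."""
    normalized_week = 7 * day_weight
    return math.sqrt(DRIFT_VARIANCE_PER_DAY * normalized_week)

def prob_beyond_one_sd():
    """Two-sided probability that a normal variable is more than 1 SD from its mean."""
    return math.erfc(1 / math.sqrt(2))

for label, weight in [("before September 1", 0.1),
                      ("late September", 1.0),
                      ("October, after the debate", 2.0)]:
    print(f"{label}: {100 * weekly_std_dev(weight):.2f}%")
print(f"chance of exceeding: {100 * prob_beyond_one_sd():.1f}%")
```

This gives roughly 0.84%, 2.65% and 3.74%, in line with the 0.8%, 2.6% and 3.7% listed above.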

This formula implies that changes in voting intentions are uncorrelated over time. On a weekly scale (which is the scale of my "eyeball calibration"), the assumption is not too crazy. But on a daily scale, it's a poor assumption: "momentum" is clearly a thing when key movements happen during a campaign. To partly correct for this:
- Before the writs are issued, I adjust a poll's calendar age down by 1 day if it's older than 1.5 days, and down by 2/3 if it's less than 1.5 days old.
- After the writs are issued, I adjust a poll's calendar age down by 2 days if it's older than 4.5 days, down by 1-2 days if it's 1.5-4.5 days old, and down by 2/3 if it's less than 1.5 days old.
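Here is one way to write this adjustment as a function. Two details are my own reading rather than something stated above, so treat them as assumptions: "down by 2/3" is taken to mean two-thirds of the poll's age (which makes the adjustment continuous at 1.5 days), and "down by 1-2 days" is taken to be a straight-line ramp from 1 day at an age of 1.5 days to 2 days at an age of 4.5 days. The function name is hypothetical.

```python
def adjusted_calendar_age(age_days, writs_issued):
    """Calendar age after the short-term momentum adjustment (a sketch)."""
    if age_days <= 1.5:
        # Assumed reading: reduce by two-thirds of the age, keeping one-third.
        return age_days / 3
    if not writs_issued:
        return age_days - 1.0
    if age_days <= 4.5:
        # Assumed linear ramp: reduction grows from 1 day to 2 days.
        reduction = 1.0 + (age_days - 1.5) / 3.0
        return age_days - reduction
    return age_days - 2.0

print(adjusted_calendar_age(3.0, writs_issued=False))  # 2.0
print(adjusted_calendar_age(6.0, writs_issued=True))   # 4.0
```

Under these assumptions the adjusted age never goes negative and has no jumps at the 1.5-day and 4.5-day cutoffs.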

Finally, factor #3 is taken into account by assigning an individual-pollster variance of 0.0002 (updated from 0.0001 per this Aug. 10 post), that is, a standard deviation of ~1.4% (previously 1%). I don't have any great confidence in this parameter value, so let me know if you think it's unreasonable. But remember that this does not take into account any sampling variance, so it's normal that polls (even those in the field at the same time) vary by much more than this implies. The effect of this is that a poll will have more weight if there are fewer other polls - especially if there are no more recent polls - by the same firm. (The old formula crudely approximated this by applying a 50% discount to any poll that is not the firm's most recent one.)
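To make the pieces concrete, here is a minimal sketch of how variance-minimizing weights can be obtained from error sources #1 to #3. It is not the model's actual code. It uses the textbook result that, for estimates with error covariance matrix S, the linear weights that sum to 1 and minimize variance are proportional to the solution x of S x = 1 (one equation per poll). The covariance entries are my reading of the assumptions above: two polls share the drift variance V(min(a_i, a_j)), polls by the same firm share the 0.0002 house variance, and each poll adds its own sampling variance. The sampling variance p(1-p)/n with p = 0.35, the tuple layout, and all function names are assumptions for illustration.

```python
def sampling_variance(sample_size, p=0.35):
    """Assumed sampling variance for a party near 35% support."""
    return p * (1 - p) / sample_size

def covariance_matrix(polls, house_var=0.0002, drift_var_per_day=0.0001):
    """polls: list of (pollster, normalized_age, sample_size) tuples."""
    k = len(polls)
    cov = [[0.0] * k for _ in range(k)]
    for i, (firm_i, age_i, n_i) in enumerate(polls):
        for j, (firm_j, age_j, _) in enumerate(polls):
            c = drift_var_per_day * min(age_i, age_j)  # shared opinion drift
            if firm_i == firm_j:
                c += house_var                         # shared in-house bias
            if i == j:
                c += sampling_variance(n_i)            # poll-specific noise
            cov[i][j] = c
    return cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def optimal_weights(polls):
    """Weights proportional to S^-1 * 1, normalized to sum to 1."""
    raw = solve(covariance_matrix(polls), [1.0] * len(polls))
    total = sum(raw)
    return [w / total for w in raw]

# Pollster A has a recent poll and an old poll; pollster B has only an old poll.
polls = [("A", 0.0, 1500), ("A", 10.0, 1500), ("B", 10.0, 1500)]
print(optimal_weights(polls))
```

With these illustrative inputs, pollster A's old poll receives a negative weight while the weights still sum to 1, which is the two-pollster scenario described earlier.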

What I'm NOT doing (and would ideally be doing if I had unlimited time)
- Except for the small adjustment mentioned above, I do not take into account serial correlation in changes in voting intention (in particular, I do not project forward from past trends), and the parameters I use are from "eyeballing" past patterns rather than a rigorous analysis.
- I am not adjusting pollsters' results for bias. That is, if pollster X consistently has better results for Party A than other pollsters, I am not adjusting X's numbers for A downward. Getting such adjustments right would require an analysis of recent years' polls for which I don't have time. Moreover, they should roughly cancel out across pollsters at crucial times in the campaign when most pollsters publish a poll. At quieter times, though, this will cause the average to be a bit more topsy-turvy than it should be.
- I am not grading the quality of pollsters (and accordingly modifying the weights). As above, lack of time.
- (No longer applies.) I originally planned to maintain an ad hoc approach when incorporating provincial/regional polls, "eyeballing" the weight they should receive, because dealing with them systematically (in particular, separately computing the weights on the provincial numbers of national polls) didn't seem worth the trouble. I now deal with regional polls systematically (though not quite optimally). Details in the regional weights post.
- Riding polls will not factor into polling averages (though they can inform riding-level adjustments).

Sunday, July 28, 2019

Comparison of 2015 Projections

This is a post I started writing after the 2015 election, but never finished. I provide it in its unfinished state, "for the record."

2015 Final Projections (italics = not model-based)
178-115-44-1-0-0 (37%, 31%, 22%, 4%, 4%) CVM Election Model
177-95-53-11-1-1 (38.0%, 30.9%, 21.3%, 5.7%, 3.4%) Teddy on Politics
160-120-50-7-1-0 (36.7%, 32.0%, 20.4%, 5.2%, 4.0%) The Signal
149-105-81-2-1-0 David Akin's Predictionator
147-115-67-7-2-0 (39.3%, 32.4%, 19.4%, 4.1%, 4.1%) Sauder Prediction Market
146-118-66-7-1-0 (37.2%, 30.9%, 21.7%, 4.9%, 4.4%) ThreeHundredEight
142-119-66-10-1-0 (37.3%, 32.4%, 20.1%, 4.9%, 4.3%) Canadian Election Watch
142-116-68-11-1-0 Election Atlas
140-115-79-3-1-0 LISPOP
138-117-76-6-1-0 (36.8%, 31.8%, 22.7%, 4.2%, 4.5%) Le calcul électoral
138-120-75-1-1-0 Election Almanac
137-120-75-8-1-0 (36.8%, 32.5%, 21.4%, 4.6%, 4.1%) Too Close to Call
128-120-83-5-2-0 Election Prediction Project

Sum of absolute seat deviations (divided by 2)
11 Teddy on Politics
16 CVM Election Model
27 The Signal
40 Sauder Prediction Market
41 ThreeHundredEight
42 Canadian Election Watch
42 Election Atlas
43 David Akin's Predictionator
49 Too Close to Call
50 Le calcul électoral
55 Election Almanac
61 Election Prediction Project
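For anyone wanting to reproduce this metric, here is a small sketch. The actual 2015 seat counts (184 Liberal, 99 Conservative, 44 NDP, 10 Bloc, 1 Green, 0 other) are not given in this post, so they are supplied here as an outside input, and the party order of the projection strings (LPC-CPC-NDP-BQ-GPC-other) is inferred rather than stated. The sum is halved because every misallocated seat is counted twice: once for the party projected too high and once for the party projected too low. The function name is mine.

```python
ACTUAL_2015_SEATS = (184, 99, 44, 10, 1, 0)  # LPC, CPC, NDP, BQ, GPC, other (outside input)

def seat_error(projection, actual=ACTUAL_2015_SEATS):
    """Sum of absolute seat deviations, divided by 2."""
    return sum(abs(p - a) for p, a in zip(projection, actual)) / 2

print(seat_error((177, 95, 53, 11, 1, 1)))   # Teddy on Politics: 11.0
print(seat_error((178, 115, 44, 1, 0, 0)))   # CVM Election Model: 16.0
print(seat_error((160, 120, 50, 7, 1, 0)))   # The Signal: 27.0
```

These three values match the corresponding entries in the list above.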

This time, I was in the middle of the pack. (Here is how I did in 2011.) Note that my projection without the turnout adjustment was off by 34, which is significantly better than average.

Three models did particularly well by this measure. If I understand correctly, Teddy on Politics injects a fair bit of personal judgment into his projections - they're not as heavily based on polls as the other projections. This time, he got things very right (which happens fairly often, if you've been following him, although there are also big misses), foreseeing which way the wind was blowing. The CVM election model appeared to have been lucky: the seat model was too extreme, but the Liberal support was underestimated, and the two errors canceled out. For example, it correctly projected the Liberal Atlantic sweep, but on percentages with which the sweep would not have happened (as the Liberals won some seats narrowly). And it incorrectly predicted an NDP wipeout in Ontario, which didn't even come close to happening.

This leaves us with The Signal, which did very well. A big part of this is that it correctly foresaw that the Liberal Québec vote would be quite efficient. Unfortunately, its description of methodology is not very detailed, so it is hard to see why exactly it did better than everyone else - especially given that it had the lowest Liberal popular vote projection. Still, congrats to The Signal!

Average absolute popular vote deviation for 5 parties
0.46 Sauder Prediction Market
0.84 Canadian Election Watch
0.94 The Signal
1.02 Teddy on Politics
1.16 Too Close to Call
1.30 ThreeHundredEight
1.40 CVM Election Model
1.48 Le calcul électoral
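The same kind of check works for the popular vote metric. As before, the actual 2015 vote shares are not given in this post; the sketch below supplies them as an outside input, rounded to one decimal (39.5% Liberal, 31.9% Conservative, 19.7% NDP, 4.7% Bloc, 3.4% Green), and assumes the projections list the parties in that order. The function name is mine.

```python
ACTUAL_2015_VOTE = (39.5, 31.9, 19.7, 4.7, 3.4)  # LPC, CPC, NDP, BQ, GPC, in % (outside input)

def mean_abs_vote_error(projection, actual=ACTUAL_2015_VOTE):
    """Average absolute deviation, in percentage points, across the 5 parties."""
    return sum(abs(p - a) for p, a in zip(projection, actual)) / len(actual)

sauder = (39.3, 32.4, 19.4, 4.1, 4.1)
election_watch = (37.3, 32.4, 20.1, 4.9, 4.3)
print(round(mean_abs_vote_error(sauder), 2))          # 0.46
print(round(mean_abs_vote_error(election_watch), 2))  # 0.84
```

Both results agree with the figures listed above.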

By this measure, I provided the best poll-based projection. The unadjusted numbers were off by 0.90 on average. The polls did quite well this time, and adjusting polls to reflect the latest trends helped. In fact, had my adjustments not been so cautious, the vote (and seat) projections would have been even more accurate - perhaps close to the almost-on-the-dot Sauder market prediction.

Speaking of the Sauder prediction market, feeding its popular vote prediction into most seat models would probably have produced more accurate seat counts than it predicted. This suggests that while market participants were good at guessing the overall vote, the seat count conversion proved challenging. This is why quantitative seat models help!

Number of ridings correctly projected
275 Teddy on Politics
269 ThreeHundredEight
268 Canadian Election Watch
261 Election Prediction Project

Friday, July 26, 2019

2015 Result Map

Here is a map of the results of the 2015 General Election. I am getting the blog ready for the 2019 General Election, so stay tuned!