Update Aug. 11: Change in the common pollster variance used to derive confidence intervals. New value underlined.
Update Aug. 13: Major change in how I am presenting projections. Changes underlined. I have also added the first paragraph containing information about inputs into the model.
Update Aug. 21: Change to turnout adjustment. Changes underlined.
Update Aug. 26: Separate polling weights are now derived for each region. Link to relevant post added.
Update Sept. 13: Belatedly added a missing link to Round 2 of polling-related adjustments.
Update Oct. 11: Belatedly added missing links to Rounds 3 and 4 of polling-related adjustments.
My model is relatively parsimonious. It uses the following information:
- Results of the last General Election and by-elections since then
- Results of earlier elections, but ONLY for "special cases" (see part II of this post) and to calibrate model parameters such as the turnout adjustment (discussed below) and uncertainty
- Polling data
In particular, the model does NOT use the following information:
- Election results before the last General Election, except for special cases and calibration
- Incumbency effect, except when a star MP leaves, which is treated as a "special case" (incumbency effect tends to be insignificant for low-profile politicians, unlike in the U.S.)
- Demographic data
- Data about online activity (e.g. search data, social media followers, etc.)
Despite using a limited set of data, I have a good track record relative to others making seat projections: see the 2011 and 2015 comparisons. I have also enhanced the model in several ways for the 2019 election (see the links below), so I hope to do well again!
The projections are made on vote shares after the following turnout adjustments:
CON: +1.5 pp
NDP: -1 pp
LIB: +0.5 pp
GRN: -1 pp
This may explain why my projection is more favourable to the Conservatives than other projections. The rule of thumb in this election appears to be, very roughly, 1 point nationwide = 10 seats (CLARIFICATION: this is for the gap between the two main parties).
The turnout adjustments are based on how election results compared with the final polls in recent federal general elections. The Conservatives have consistently outperformed the polls: by about 1 point in 2015 (they did worse seat-wise because the Liberals also outperformed, and did so very efficiently), and by several points in 2008 and 2011. The Liberals roughly matched the final polls in 2008 and 2011, and outperformed them by 1-2 points in 2015. The Greens have consistently underperformed by 1 point or more. The NDP has usually somewhat underperformed as well.
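For the geeks, here is a minimal Python sketch of how the turnout adjustments above could be applied to a polling average. The poll numbers are made up, the function name is hypothetical, and the last step reads the rule of thumb as "a 1-point change in the CON-LIB vote gap moves the seat gap by about 10", which is an assumption about how to apply it rather than part of the model itself.

```python
# Turnout adjustments from this post, in percentage points.
TURNOUT_ADJUSTMENT_PP = {"CON": 1.5, "NDP": -1.0, "LIB": 0.5, "GRN": -1.0}

# Very rough rule of thumb from this post: seats per point of CON-LIB gap.
SEATS_PER_POINT_OF_GAP = 10


def apply_turnout_adjustment(poll_shares):
    """Return vote shares after adding each party's turnout adjustment.

    Parties without a listed adjustment are left unchanged.
    """
    return {
        party: share + TURNOUT_ADJUSTMENT_PP.get(party, 0.0)
        for party, share in poll_shares.items()
    }


# Hypothetical national polling average (percent); not real data.
polled = {"CON": 33.0, "LIB": 34.0, "NDP": 14.0, "GRN": 10.0, "BQ": 5.0, "PPC": 3.0}
adjusted = apply_turnout_adjustment(polled)

# How much the adjustment moves the gap between the two main parties.
gap_change = (adjusted["CON"] - adjusted["LIB"]) - (polled["CON"] - polled["LIB"])
rough_seat_gap_change = SEATS_PER_POINT_OF_GAP * gap_change

print(adjusted)
print(gap_change, rough_seat_gap_change)
```

Since the four adjustments sum to zero, the total of the vote shares is unchanged; only the distribution between parties moves.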
In the interest of transparency (and geekiness), here are the posts describing the methodology for these seat projections. Many of these features are new for 2019!
Poll inclusion criteria
How national polling weights are derived (see this post for an illustration)
How regional polling weights are derived
General modification to uniform swing and riding-level adjustments to the 2015 baseline
Polling-related adjustments: Round 1, Round 2, Round 3, Round 4
The seat numbers on the left are the number of ridings where each party is projected ahead.
The shortcut I take to estimate the expected number of seats will lead to a slight underestimation of parties that are disproportionately often competitive despite being third or fourth. My simple method is explained here:
Estimates for expected number of seats
Confidence intervals, provided from time to time, are crudely estimated rather than based on simulations. This is merely to save myself some work - ideally, they should be based on simulations. I differentiate between intervals for a hypothetical election happening at the time of the latest poll (specifically, the midpoint) and those for the actual election. Ranges for the latter are wider because public opinion may shift. All intervals are speculative; for large parties, I round them to the nearest 5 to draw attention to their very rough nature.
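As an illustration of the rounding only, here is a short Python sketch. The function names, the example seat ranges and the `large_party` flag are all hypothetical; in particular, where the line between a "large" and a "small" party falls is not specified here.

```python
import math


def round_to_step(value, step):
    """Round to the nearest multiple of `step`, with halves rounded up."""
    return int(math.floor(value / step + 0.5)) * step


def present_interval(low, high, large_party, step=5):
    """Round a seat interval to the nearest `step` for large parties.

    Small parties are rounded to whole seats only (an assumption).
    """
    if large_party:
        return round_to_step(low, step), round_to_step(high, step)
    return round_to_step(low, 1), round_to_step(high, 1)


# Hypothetical raw intervals, not actual projections.
print(present_interval(132, 168, large_party=True))
print(present_interval(1.6, 4.2, large_party=False))
```

The helper avoids Python's built-in `round`, which rounds halves to the nearest even number and would therefore treat 2.5 and 3.5 differently.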
The assumptions behind the confidence intervals are the same as those behind poll weighting. In addition, I assume a variance of
Some other projection websites also provide confidence intervals, often using simulations, which is the way it should ideally be done. However, simulations can still be junk if they're miscalibrated. Other sites aren't always clear about the timing of the hypothetical election for which confidence intervals are given. If they're based on how accurate the last polls before an election have historically been, the confidence intervals would be for a hypothetical election taking place just after the latest poll. They should therefore be wider than the confidence intervals I give for a hypothetical election taking place during the latest poll, and narrower than the ones I give for the actual election (except just before the election, when they should be similar to the latter). If another projection site gives confidence intervals for an election today/tomorrow narrower than those that I give for an election as of the last poll, then the other site is probably not taking the uncertainty seriously enough.
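To make the ordering of these three kinds of intervals concrete, here is a rough Python sketch. It assumes the error sources are independent, so that their variances add, and it works with vote shares rather than seats. Every variance value below is invented purely for illustration and is not a parameter of my model.

```python
import math


def half_width_95(*variances):
    """Approximate 95% half-width for a sum of independent error sources."""
    return 1.96 * math.sqrt(sum(variances))


# All values are hypothetical, in squared percentage points.
POLL_ERROR_VAR = 1.0      # error in the polling average itself
LATE_SHIFT_VAR = 0.5      # opinion shift between the final polls and election day
CAMPAIGN_DRIFT_VAR = 3.0  # opinion shift over the rest of a long campaign

# Election held at the time of the latest poll: polling error only.
at_latest_poll = half_width_95(POLL_ERROR_VAR)

# Calibrated on how final polls compare with results: includes the late shift.
just_after_poll = half_width_95(POLL_ERROR_VAR, LATE_SHIFT_VAR)

# Actual election, still weeks away: includes drift over the whole campaign.
actual_election = half_width_95(POLL_ERROR_VAR, CAMPAIGN_DRIFT_VAR)

print(at_latest_poll, just_after_poll, actual_election)
```

As election day approaches, the campaign drift term shrinks towards the late shift term, which is why the last two intervals should end up similar just before the election.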