Sunday, August 4, 2019

The Not-Quite Uniform Swing Model for 2019

Update Aug. 9: I've made some changes (indicated in italics) to Section I. The model is slightly simplified and hopefully more robust.

Update Aug. 10: Added Edmonton Strathcona to Section II.e. This change does not affect any past projections.

Update Aug. 13: Added Kenora to Section II.d. This change flips Kenora to the Liberals as of Aug. 13. Also added a clarification at the end of Section I.

Update Aug. 15: Changed the adjustment for Victoria in Section II.a, and added Thunder Bay--Superior North to Section II.d. (see Sept. 8 update)

Update Aug. 18: Reviewed the adjustment for Skeena--Bulkley Valley in Section II.e.

Update Aug. 20: Added Longueuil--Saint-Hubert to Section II.c to reflect news.

Update Aug. 21: Reviewed the adjustment for Burnaby South (Section II.b) to offset new NDP turnout adjustment, and refined where the extra NDP vote comes from. Also added adjustments for Halifax, Sackville--Preston--Chezzetcook and Ottawa Centre (Section II.d) following the same approach as used for Skeena--Bulkley Valley in an earlier update.

Update Aug. 22: Added Windsor West to Section II.f. Adjustments further updated as of Sept. 8.

Update Sept. 8: Removed Thunder Bay--Superior North from Section II.d due to Bruce Hyer candidacy.

Update Sept. 16: Reviewed Lac-Saint-Jean adjustments.

Update Sept. 27: Added Section II.f. adjustment to Laurier--Sainte-Marie.

Update Oct. 5: Expanded Section II.a. to include candidates ejected by their party or withdrawing after the 2019 nomination deadline, and added an adjustment for Burnaby North--Seymour.

As those of you familiar with this blog know, my seat projections are primarily based on assuming uniform swing within each of Canada's six "polling regions" (BC, AB, MB/SK, ON, QC, Atlantic) from the previous election to the current polling average. This post describes two categories of non-polling-based* changes that I am making to uniform swing.
*Other than comparing by-election results to contemporaneous polls for adjustments in II(b)

Part I is geeky (light math involved), while Part II looks at many specific races. They're independent from each other, so if you don't want to eat your veggies before dessert, skip to Part II!

[EDIT Aug. 5: Added Ottawa Centre to the list, though no adjustment there for now.]

I. Smaller Swings in Uncompetitive Ridings

As my post analyzing model performance for the 2015 election noted, uniform swing tends to shift party support too much when it is either very low or very high. This is not a new discovery, but due to the size of the swings in 2015, it actually mattered. Going to proportional swing doesn't solve the problem: while it would reduce shifts when party support is very low, it would increase them when party support is very high.

The starting point of my modification is the assumption that shifts between two parties are roughly proportional to the sizes of both parties - a sort of gravity model of voting. This isn't crazy: a shift from party A to party B has greater potential if A has more supporters and B has more salience. If there were only two parties, this would mean that shifts are biggest when the parties are at 50-50 (product of shares 0.25), still similarly big when they're at 70-30 (0.21), but substantially smaller when the support levels are 80-20 (0.16) or more extreme.

Of course, Canada has more than two major parties, so things are more complicated. First, for every pair of "significant" parties (big 5 in Québec, big 4 elsewhere)*, I multiply the two parties' pairwise vote shares. For example, if a riding is LIB 40%, CON 30%, NDP 20%, GRN 10%, then the three pairwise vote shares involving the Liberals for that riding are:
LIB-CON: (4/7)*(3/7) = 12/49
LIB-NDP: (4/6)*(2/6) = 2/9
LIB-GRN: (4/5)*(1/5) = 4/25

Then, I calculate a "coefficient of variation" for each party within each riding, equal to the average of the pairwise vote shares involving that party, weighted by the vote share of the other party in each pair. So in the example above, the Liberal coefficient of variation would be:
(30*(12/49) + 20*(2/9) + 10*(4/25))/60
This is approximately equal to 0.223. Note that the maximum possible coefficient is 0.25, which happens when a party is tied for first place, and any party not tied for the lead has no support. This corresponds to the intuition that competitive ridings may be especially fluid because many voters see the parties in contention as roughly equally good (or bad).
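As a check on the arithmetic above, here is a minimal Python sketch of the coefficient calculation. The function name and the dictionary layout are my own illustrative choices, not part of the model's actual implementation.

```python
def coefficient_of_variation(shares, party):
    """Weighted average of pairwise vote-share products involving `party`.

    `shares` maps party name -> vote share (any consistent unit).
    Each pair's product is weighted by the other party's vote share.
    """
    own = shares[party]
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_share in shares.items():
        if other == party or other_share <= 0:
            continue  # parties with no support contribute nothing
        pair_total = own + other_share
        pairwise = (own / pair_total) * (other_share / pair_total)
        weighted_sum += other_share * pairwise
        weight_total += other_share
    return weighted_sum / weight_total if weight_total else 0.0


example = {"LIB": 40, "CON": 30, "NDP": 20, "GRN": 10}
lib_coef = coefficient_of_variation(example, "LIB")
print(round(lib_coef, 3))  # 0.223
```

A two-party tie with nobody else in the race, e.g. `{"A": 50, "B": 50, "C": 0}`, gives the 0.25 maximum.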

The swing I apply to a party's vote share in riding X is then its regional swing multiplied by its coefficient of variation in riding X, divided by its average coefficient of variation in the region. So if the Liberals have a coefficient of 0.22 in Abitibi--Baie James--Nunavik--Eeyou (it's my favourite riding name), an average coefficient of 0.2 in Québec, and increase by 5% in Québec, then I increase their vote share by 5%*0.22/0.2 = 5.5% in Abitibi--Baie James--Nunavik--Eeyou.

Then, for each riding, I proportionally increase or decrease the support levels so that they sum to the same number as they would if uniform swing were applied (i.e. 95-100% for most ridings).

Finally, I compute the average swing for each party within each region. This should be close to the target uniform swing. Any remaining gap is closed via uniform swing.
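The three steps above (scaled swing, per-riding rescaling, closing the remaining gap) can be sketched as follows. This is a rough illustration under assumptions that are mine rather than the post's: the regional averages are simple unweighted means over ridings, no minimum threshold is applied, and all function and variable names are hypothetical.

```python
def coef(shares, party):
    """Coefficient of variation for `party` (same definition as in the text)."""
    own = shares[party]
    num = den = 0.0
    for other, s in shares.items():
        if other == party or s <= 0:
            continue
        num += s * (own * s) / (own + s) ** 2
        den += s
    return num / den if den else 0.0


def scaled_swing(regional_swing, riding_coef, regional_avg_coef):
    """Regional swing scaled by the riding's coefficient relative to the average."""
    return regional_swing * riding_coef / regional_avg_coef


def project_region(ridings, regional_swing):
    """ridings: {riding: {party: share}}; regional_swing: {party: points}."""
    parties = list(regional_swing)
    n = len(ridings)
    coefs = {r: {p: coef(sh, p) for p in parties} for r, sh in ridings.items()}
    # Assumption: unweighted mean across ridings.
    avg = {p: sum(coefs[r][p] for r in ridings) / n for p in parties}

    projected = {}
    for r, sh in ridings.items():
        raw = {p: sh[p] + scaled_swing(regional_swing[p], coefs[r][p], avg[p])
               for p in parties}
        # Rescale so the riding sums to what uniform swing would have given.
        uniform_total = sum(sh[p] + regional_swing[p] for p in parties)
        factor = uniform_total / sum(raw.values())
        projected[r] = {p: v * factor for p, v in raw.items()}

    # Close any remaining gap to the regional swing with a uniform shift.
    for p in parties:
        mean_swing = sum(projected[r][p] - ridings[r][p] for r in ridings) / n
        gap = regional_swing[p] - mean_swing
        for r in ridings:
            projected[r][p] += gap
    return projected


ridings = {
    "Riding A": {"LIB": 40, "CON": 30, "NDP": 20, "GRN": 10},
    "Riding B": {"LIB": 15, "CON": 60, "NDP": 20, "GRN": 5},
}
swing = {"LIB": 5, "CON": -3, "NDP": -4, "GRN": 2}
result = project_region(ridings, swing)
```

With these choices, the final uniform shift leaves each riding's total unchanged, because the per-party gaps sum to zero once every riding has been rescaled to its uniform-swing total.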

Now, a question arises: should the coefficients of variation be calculated based on support levels in the previous election, or hypothetical support levels after applying uniform swing (with a minimum threshold at, say, 2%)? Testing this on the shifts from 2011 to 2015 using actual regional support levels, I find that the variance of errors is minimized when calculating the coefficient both ways, and using 10% of the former and 90% of the latter. However, testing on a hypothetical reverse shift (if we were going back in time from 2015 to 2011) suggests that putting more weight on the former is better. Therefore, I have decided to proceed in a symmetric way, and simply take the arithmetic mean of the two. (An earlier version of this post also said that the results were best when pulled back 15% of the way toward what uniform swing would have given; that no longer applies after the formula change.)
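To make the symmetric choice concrete, here is a small sketch of blending the two coefficients. The helper names, the `weight_prev` parameter and the way the 2% floor is applied are assumptions for illustration only.

```python
def coef(shares, party):
    """Coefficient of variation for `party` (same definition as in Section I)."""
    own = shares[party]
    num = den = 0.0
    for other, s in shares.items():
        if other == party or s <= 0:
            continue
        num += s * (own * s) / (own + s) ** 2
        den += s
    return num / den if den else 0.0


def swung_shares(prev_shares, regional_swing, floor=2.0):
    """Hypothetical shares after uniform swing, floored at a minimum threshold."""
    return {p: max(s + regional_swing.get(p, 0), floor)
            for p, s in prev_shares.items()}


def blended_coef(prev_shares, regional_swing, party, weight_prev=0.5):
    """Mix of the coefficient on previous-election shares and on swung shares.

    weight_prev=0.5 is the arithmetic mean; weight_prev=0.1 would be the
    10%/90% mix that fit the 2011-to-2015 shift best.
    """
    before = coef(prev_shares, party)
    after = coef(swung_shares(prev_shares, regional_swing), party)
    return weight_prev * before + (1 - weight_prev) * after


prev = {"LIB": 40, "CON": 30, "NDP": 20, "GRN": 10}
swing = {"LIB": 5, "CON": -3, "NDP": -4, "GRN": 2}
lib_blend = blended_coef(prev, swing, "LIB")
```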

Next question: does this actually work? Again, testing this on the shifts from 2011 to 2015 using actual regional support levels, I find that, outside Québec, the variance of vote share errors is reduced by around 10% for most parties relative to using uniform swing (with a minimum threshold); the earlier version of the formula gave 10-20%. (Within Québec, it doesn't seem to make much of a difference.) The regional projected seat counts become slightly closer to the actual result on average. Under the earlier version of the formula, Québec was the main exception to the latter, as the method would have overshot the Liberal seat count by even more than uniform swing (57 instead of 47, vs. 40 actual), though half of the additional Liberal overshoot actually improved riding-level accuracy (either the Liberal actually won the riding, or the Liberal was 2nd while uniform swing projected the 3rd place candidate as the winner). This problem no longer arises in Québec.

Now, since this change was inspired by observations made in 2015, one should worry that the method wouldn't work on another election. While I didn't do extensive testing, I did look at what would happen if one were to project the 2011 results from the 2015 to 2011 shifts (i.e. pretend that we're going back in time). Again, the variance of vote share errors is reduced (though not by as much). Although the seat projection ends up slightly farther from the actuals, the number of ridings projected correctly slightly improves.

I will adopt this model for my 2019 projections, and assess its performance relative to uniform swing after the election.

*Clarification: For 2019, I am assigning a flat vote share of 2.4% for the People's Party of Canada (outside of Beauce) and of 0.6% for minor candidates (all candidates outside the six largest parties, except for star independents - currently Wilson-Raybould and Philpott). Thus the five large parties' support will always add up to 96.7-96.8% (97% minus adjustments for Bernier and star independents). This is to correct for the tendency of polls to overstate the support of minor parties and "others." Of course, if the People's Party takes off, I will review this assumption.

II. Adjustments to 2015 Riding Baselines

These adjustments do NOT include poll-based adjustments to swing, which will be discussed in a future post. At issue here are adjustments based on information about candidates and past election results. These are somewhat subjective, so let me know if you see something that you think is way off!

a) Party not running or withdrawing in a riding in 2015 or 2019

Labrador: Greens did not run in 2015. Baseline adjustments: CON -0.2, NDP -0.3, LIB -1, GRN +1.5. Obviously, this adjustment makes little difference.

Mississauga--Malton: Conservative candidate dropped by party in 2015 (after deadline, so still got many votes). Baseline adjustments: CON +2, NDP -0.3, LIB -1.4, GRN -0.3. Again, this adjustment makes little difference.

Kelowna--Lake Country: Greens did not run in 2015. Baseline adjustments: CON -1.3, NDP -2.3, LIB -5.7, GRN +9.3. This will make it even harder for the Liberals to keep the riding, which is already an uphill climb.

Victoria: Liberal candidate withdrew in 2015 (after deadline, so still got some votes). Baseline adjustments: NDP -6.5, LIB +22.8, GRN -16.3. This is a tight three-way race. (This adjustment used to be a blend of different approaches, and was modified after new polling suggested that one of the approaches is more appropriate. The previous adjustments were CON -1.3, NDP -3.7, LIB +16.3, GRN -11.3, which left a tighter Green/NDP race with the Greens still favoured.)

Burnaby North--Seymour: The Conservative candidate has been ejected from her party, but will remain on the ballot, much like the Liberal candidate in Victoria in 2015. The Victoria adjustments imply that the Liberal candidate that dropped out still got about 1/3 of the vote that she would otherwise have received. For this year's situation, since Conservative voters are less likely to have a second choice, I assume that this candidate will retain 1/2 of the vote that she would have otherwise received - around 15% per the current projection. The remaining 15% is distributed among the other parties through a combination of two effects: (i) Tories actually voting for their second choice, and (ii) Tories staying home. Factor (i), according to second choice data in recent surveys, might slightly favour the PPC and the NDP over the Liberals and Greens - but it comes pretty close to an even four-way split. Factor (ii) will increase the vote share of parties proportionally, thereby favouring the Liberals and NDP. To obtain a result consistent with these assumptions, the baseline adjustment is: CON -14, NDP +6, LIB +5.5, GRN +2.5, and will be coupled with a 3-point increase for the PPC projection taken proportionally from other parties. These changes will be applied on projections where the midpoint date of the most recent poll is October 4 or later.

b) By-elections

CON holds (no adjustment)
Medicine Hat--Cardston--Warner
Calgary Heritage
Calgary Midnapore
Sturgeon River--Parkland
Leeds--Grenville--Thousand Islands and Rideau Lakes

LIB holds (no adjustment)

NDP hold
Burnaby South (Jagmeet Singh's riding): -1.5 CON, +6 NDP (previously +5), -3.5 LIB (previously -5), -1 GRN. This puts the Liberals further behind, making this riding an NDP/Conservative race (the People's Party candidate, who got over 10% in the by-election, is running in Alberta in the General Election).

Lac-Saint-Jean: -10 CON, -8 NDP (previously -5), +13 LIB (previously +10), +5 BQ. This is shaping up to be a Liberal/Conservative contest.
South Surrey--White Rock: No adjustment

Chicoutimi--Le Fjord: +25 CON, -10 NDP, -5 LIB, -10 BQ. This riding now looks solid for the Tories.

Outremont: No adjustment

Nanaimo--Ladysmith: -5 LIB, +5 GRN. This helps the Greens hold the NDP at bay.

c) MP changed affiliation and seeking re-election
- Beauce: rolled into poll-based adjustment
- Longueuil--Saint-Hubert: -3 NDP, +3 GRN. This is a relatively cautious adjustment, and will be increased if polls warrant. The riding remains a two-way LIB/BQ race.
- Aurora--Oak Ridges--Richmond Hill: No adjustment
- Markham--Stouffville: rolled into poll-based adjustment
- Vancouver Granville: rolled into poll-based adjustment
Raj Grewal (Brampton East) and Darshan Kang (Calgary Skyview), two Liberals turned Independent, have not yet announced (as far as I know) whether they will run again. If they do, there may be adjustments.

d) Star candidates or strong independent candidates that lost in 2015 and may not run in 2019
Note: If a candidate ends up running again, the corresponding adjustment will be cancelled.

Avalon (Scott Andrews): CON +6, NDP +6, LIB +5.5. This won't make much of a difference.

Egmont (Gail Shea): CON -10, LIB +10. Since the 1980s, this seat had consistently been around the PEI average or more Liberal until Gail Shea ran in 2008. The adjustment pushes the seat most (but not all) of the way toward the PEI average, and implies a competitive race.

Halifax (Megan Leslie) and Sackville--Preston--Chezzetcook (Peter Stoffer): Originally, I made no adjustment here since there is no reasonable base (Leslie was preceded by Alexa McDonough, while the Sackville riding has been represented by Stoffer since its creation), noting only that I would be less hesitant to adjust these ridings away from the NDP based on riding polls. The current adjustments are as follows. Halifax: NDP -1, LIB +0.5, GRN +0.5. No new information, but I took another look at the situation by examining how things have evolved since Leslie inherited the riding from McDonough in 2008. There's plenty of uncertainty as to the size of the Leslie effect (it could be a lot larger than the adjustment implies), but I went with the most cautious method. Sackville--Preston--Chezzetcook: NDP -2, LIB +2. No new information either, but similarly, I took a look at how things have evolved since the riding was created in 1997. This adjustment is very cautious: while it looks like Peter Stoffer had a large positive effect on the NDP vote, most of it came between 1997 and 2004. So it's possible that most of those voters became habitual NDP supporters, or that the effect gradually dissipated, and Stoffer's vote held up due to other factors. In both cases, we have tight races (three-way in Sackville--Preston--Chezzetcook, LIB/NDP in Halifax), though I wouldn't be surprised if constituency polls end up dramatically increasing the adjustment and favouring the Liberals.

Avignon--La Mitis--Matane--Matapédia (Jean-François Fortin): NDP +3.7, LIB +3.7, GRN +0.5, BQ +3.7. This won't make much of a difference.

Laurier--Sainte-Marie (Gilles Duceppe): NDP +2, LIB +1, BQ -3. This will reduce the odds of the Bloc retaking this riding.

Pierre-Boucher--Les Patriotes--Verchères (JiCi Lauzon): NDP +1, LIB +1, GRN -6, BQ +4. This will increase the odds of the Bloc keeping this riding.

Ottawa Centre (Paul Dewar): NDP -2.5, LIB +2, GRN +0.5. Originally, I made no adjustment here: somewhat surprisingly, historical patterns did not suggest Dewar did better electorally than an "average" New Democrat would have done in this riding, so I only noted that I would be less hesitant to adjust this riding away from the NDP based on riding polls. There is no new information, but I took another look at the situation by examining how things have evolved since Dewar inherited the riding from Broadbent in 2006, and, if you squint, Dewar 2015 seems slightly more popular than Dewar 2006.

Eglinton--Lawrence (Joe Oliver): CON -1, LIB +1. This will increase the odds of the Liberals keeping this riding.

Renfrew--Nipissing--Pembroke (Hec Clouthier): CON +5, NDP +1, LIB +5. High uncertainty here (vote pattern suggests Clouthier's vote came from Tories, but he's an ex-Liberal), but it doesn't matter much because this is a very conservative riding (one of two ON Canadian Alliance seats in 2000).

Thunder Bay--Superior North (Bruce Hyer): CON +3, NDP +5, GRN -8. Hyer was elected as an NDP MP in 2008 and 2011, but ran under the Green banner in 2015. This adjustment makes the NDP stay the main contender to the (still favoured) Liberals. (Update Sept. 8: Hyer is running again per this article.)

Kenora (Howard Hampton): CON +2, NDP -4, LIB +2. This will increase the odds of the Liberals keeping this riding.

Dauphin--Swan River--Neepawa (Inky Mark): CON +4.5, NDP +0.5, LIB +2.5, GRN +0.5. Still safe Conservative.

St. Albert--Edmonton (Brent Rathgeber): CON +5, NDP +3, LIB +10, GRN +1.5. Still safe Conservative.

e) Star MPs retiring in 2019

Cumberland--Colchester (Bill Casey): CON +10, LIB -10. This traditionally Conservative seat becomes a very likely Tory pickup (likeliest in Atlantic Canada outside NB).

Kings--Hants (Scott Brison): CON +5, LIB -5. This is maybe half the voters Scott Brison "took with him" from the PCs to the Liberals back in the day - the assumption being that the other half may stay in the Liberal camp now that it has been so long. The adjustment makes this riding more like West Nova and potentially competitive.

Edmonton Strathcona (Linda Duncan): NDP -3, LIB +3. Linda Duncan has been around for a long time, so this adjustment is pretty cautious.

Skeena--Bulkley Valley (Nathan Cullen): CON +1.6, NDP -5, LIB +2.9, GRN +0.5. Originally, I made no adjustment here since there is no reasonable base (this riding has been represented by Cullen since its creation), noting only that I would be less hesitant to adjust this riding away from the NDP based on riding polls. There is no new information, but I took another look at the situation by examining how things have evolved since the riding was created in 2004, and Cullen 2015 seemed quite a bit more popular than Cullen 2004. Thus, an adjustment is in order even without comparing to pre-Cullen days. This adjustment is cautious: much like in Kings--Hants, I'm going with just half of the apparent effect due to how long the retiring MP has been around.

f) New star candidates running in 2019

Laurier--Sainte-Marie (Steven Guilbeault): NDP -7, LIB +9, GRN -1, BQ -1. Originally, I made no (further - see above) adjustment here, noting only that this is a riding where I would be less hesitant to adjust in the Liberal direction based on riding polls. We finally have a riding poll in this riding, and it suggests a sizeable Guilbeault effect. (Combined with the Duceppe adjustment above, it's NDP -5, LIB +10, GRN -1, BQ -4 in this riding.)
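The combined figure for this riding is just the party-by-party sum of the two adjustments. A tiny sketch, with a helper name that is purely illustrative:

```python
def combine(*adjustments):
    """Sum several baseline adjustments party by party."""
    total = {}
    for adjustment in adjustments:
        for party, delta in adjustment.items():
            total[party] = total.get(party, 0) + delta
    return total


duceppe = {"NDP": 2, "LIB": 1, "BQ": -3}                  # Section II.d
guilbeault = {"NDP": -7, "LIB": 9, "GRN": -1, "BQ": -1}   # Section II.f
combined = combine(duceppe, guilbeault)
print(combined)  # {'NDP': -5, 'LIB': 10, 'BQ': -4, 'GRN': -1}
```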

Windsor West (Sandra Pupatello): NDP -11.3, LIB +12.8, GRN -1.5 to 2015 baseline, and additionally NDP +1.25, LIB +1.25, GRN -2.5 to projection. (Before the Sept. 8 update, these were NDP -12.5, LIB +14 to baseline and GRN -3 to projection.) The reason for a two-step adjustment is that I couldn't fully adjust the Greens in the baseline without going negative. This adjustment is guided by this poll (but is classified as a model rather than poll-based adjustment since it's clearly due to the candidate's star status). This riding goes from safe NDP to a close NDP/LIB race. Update Sept. 8: Adjustments modified to avoid double-counting regional adjustment to Southwestern ON.

