Small-size matching R analysis for election booth and door-to-door campaigning in European Parliament elections 2019 in Berlin-Mitte
1. Introduction
The European Parliament election in May 2019 resulted in an unexpectedly high share of Berlin citizens voting for the Green party. Because intensive campaigning by the local chapter coincided with an overall increase in the Greens' vote share, this short analysis seeks to disentangle the “campaigning” effect from the “atmosphere” effect. We analyse local district data including demographics, Green vote share and a record of campaigning activities. The data set is small, so confirming the effects would require more robust controls and a larger data set.
1.1 Preparation & Data basis
Libraries: The non-standard R libraries used are “cobalt” and “MatchIt”.
1.2 Loading, cleaning and formatting data
The data set is retrieved from the Statistical Bureau of Berlin-Brandenburg and is further enriched with district-specific campaigning activities, originating from the Green Party in Berlin-Mitte. The raw data of the campaigning activities cannot be disclosed. The relevant variables are “wkst”, a dummy for having held at least one election booth, and “htwk”, a dummy for having applied door-to-door campaigning in a certain district.
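Since the raw campaigning data cannot be disclosed, the following is only a minimal sketch of how the two dummies could be derived once the activity records are merged onto the district table. Everything except the names "wkst" and "htwk" is an assumption made for illustration: the toy values, the key "district_id", and the columns "n_booths" and "dtd_campaign" are hypothetical.

```r
# Toy stand-ins for the two sources; values and most column names are invented
districts <- data.frame(
  district_id = c("A", "B", "C", "D"),
  gruene_pct  = c(38.2, 21.5, 35.0, 41.7)
)
activities <- data.frame(
  district_id  = c("A", "C", "D"),
  n_booths     = c(2, 0, 1),            # hypothetical count of election booths
  dtd_campaign = c(FALSE, TRUE, TRUE)   # hypothetical door-to-door flag
)

# Left join: keep every district, even those without any recorded activity
dat <- merge(districts, activities, by = "district_id", all.x = TRUE)

# Districts missing from the activity log are treated as "no campaigning"
dat$n_booths[is.na(dat$n_booths)] <- 0
dat$dtd_campaign[is.na(dat$dtd_campaign)] <- FALSE

# Dummies as described in the text
dat$wkst <- as.integer(dat$n_booths >= 1)  # at least one election booth
dat$htwk <- as.integer(dat$dtd_campaign)   # door-to-door campaigning applied

print(dat)
```

Coding missing activity records as zero is itself an assumption; if some districts were simply not tracked, they would be better excluded than coded as untreated.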
2. Analysis
2.1 Linear Regression: Continuous variables
A naive linear regression of the Green vote share on demographic variables and on measures of election campaigning shows some significant drivers of voting Green across the districts. Among these are “share of people >65”, “share of people with migration background”, and “people receiving welfare”. The effects are significant and relevant in size, yet their interpretability is limited because the naive approach captures only correlation at this level.
To illustrate the potential strength of the effect: a district in which 50% of inhabitants are older than 65 has, on average, a Green vote share 16 percentage points lower than a district with a 30% share of people aged 65+ (about 0.8 percentage points per additional percentage point of 65+ residents). The strength of the correlation can be read from the steepness of the regression line in the scatter plots below.
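The arithmetic behind this illustration can be sketched in a few lines of R. The data below are simulated, not the Berlin-Mitte data, and the single-predictor model with the hypothetical column names "share_65plus" and "gruene_pct" is a simplification of the regression described above; it only shows how a fitted slope translates into a gap between two districts.

```r
set.seed(1)

# Simulated districts; the true slope of -0.8 is chosen to mirror the text
n <- 60
share_65plus <- runif(n, min = 5, max = 50)
gruene_pct   <- 55 - 0.8 * share_65plus + rnorm(n, sd = 3)
toy <- data.frame(share_65plus, gruene_pct)

# Naive bivariate regression (correlation only, no causal claim)
fit <- lm(gruene_pct ~ share_65plus, data = toy)

# Predicted vote share for a 30% and a 50% district, and the gap between them
pred <- predict(fit, newdata = data.frame(share_65plus = c(30, 50)))
gap  <- unname(pred[2] - pred[1])

# In a linear model the gap is simply slope * (50 - 30)
slope <- unname(coef(fit)["share_65plus"])
cat("estimated slope:", slope, "\n")
cat("predicted gap (50% vs 30%):", gap, "\n")

# Slope implied by the figures quoted in the text: -16 pp over 20 pp
implied_slope <- -16 / 20
cat("slope implied by the text:", implied_slope, "\n")
```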
2.2 Dummy variable: Former East-West Berlin district
The local district Berlin-Mitte today consists of former districts of both East and West Berlin. To check for potential effects of a district's history, we include the dummy variable "OstWest" (EastWest).
As the summary shows, there is no significant difference between former East and former West districts. The means differ by 0.3 percentage points.
Nevertheless, the scatter plot reveals an interesting observation that shows up in the density layout:
Whereas results from former West districts cluster around one peak at approx. 35%, the former East districts split into those that perform very well (>40%) and those that perform poorly (approx. 20%).
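The following sketch illustrates why a comparison of means can miss such a pattern. It uses simulated vote shares shaped roughly like the description above (one West peak, two East clusters); the group labels "Ost" and "West" and the column "gruene_pct" are assumptions, and the numbers are not the real results.

```r
set.seed(7)

# Simulated vote shares: West is unimodal, East is a mixture of two clusters
west <- rnorm(40, mean = 35, sd = 4)
east <- c(rnorm(24, mean = 45, sd = 3), rnorm(16, mean = 20, sd = 3))
toy <- data.frame(
  OstWest    = rep(c("West", "Ost"), times = c(length(west), length(east))),
  gruene_pct = c(west, east)
)

# A test of means only compares the centres of the two groups
tt <- t.test(gruene_pct ~ OstWest, data = toy)
print(tt$estimate)

# The spread and the kernel density estimates expose the different shapes
sd_by_group <- tapply(toy$gruene_pct, toy$OstWest, sd)
print(sd_by_group)

d_west <- density(toy$gruene_pct[toy$OstWest == "West"])
d_east <- density(toy$gruene_pct[toy$OstWest == "Ost"])

# Draw the overlay only when run interactively
if (interactive()) {
  plot(d_west, main = "Green vote share by former East/West", xlim = c(5, 60))
  lines(d_east, lty = 2)
  legend("topleft", legend = c("West", "Ost"), lty = c(1, 2))
}
```

Two groups can share almost the same mean while one of them is bimodal, which is why the density view adds information that the summary of means does not.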
2.3 Dummy variable: Door-to-Door Campaigning
Some of the voting districts were targeted with door-to-door campaigning, in which campaigners rang the doorbells of private households and informed citizens about the political positions of the Greens. We examine the effect of having campaigned door-to-door in a district.
Looking at the scatter plot, we see a distinct clustering by the door-to-door dummy variable, i.e. districts targeted with door-to-door campaigning versus those not targeted. To further analyse the potential effect of door-to-door (DtD) campaigning, we first calculate the Naive Average Treatment Effect and then refine the result with a matching strategy.
2.3.1 Naive Average Treatment Effect
Comparing the districts that were targeted with door-to-door campaigning with those that were not, we test the naive average treatment effect (NATE) for significance. The effect is highly significant and large: districts with door-to-door campaigning scored on average 10 percentage points higher than districts without it. Because quick conclusions drawn from regressions/NATE warrant skepticism, we analyse the potential effect of door-to-door campaigning more thoroughly.
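As a minimal sketch of what the NATE amounts to, the snippet below computes it three equivalent ways on simulated data: as a raw difference in group means, as the coefficient on the dummy in a simple regression, and via a Welch t-test for significance. Only the dummy name "htwk" comes from the text; the outcome column "gruene_pct", the group sizes and all values are assumptions.

```r
set.seed(42)

# Simulated districts: 40 untreated, 15 treated (group sizes are arbitrary)
toy <- data.frame(htwk = rep(c(0L, 1L), times = c(40, 15)))
toy$gruene_pct <- 28 + 10 * toy$htwk + rnorm(nrow(toy), sd = 6)

# 1) NATE as a plain difference in means
nate <- mean(toy$gruene_pct[toy$htwk == 1]) - mean(toy$gruene_pct[toy$htwk == 0])

# 2) The same number is the slope on the dummy in a simple regression
fit <- lm(gruene_pct ~ htwk, data = toy)
nate_lm <- unname(coef(fit)["htwk"])

# 3) Welch two-sample t-test for the significance of the difference
tt <- t.test(gruene_pct ~ htwk, data = toy)

cat("NATE (difference in means):", nate, "\n")
cat("NATE (regression slope):   ", nate_lm, "\n")
cat("p-value (Welch t-test):    ", tt$p.value, "\n")
```

The NATE is called naive because it attributes the whole gap to the treatment; any difference in district composition between the two groups is folded into the estimate.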
2.3.2 Matching
We apply a Nearest Neighbor Matching, based on the potentially characterizing variables: "OstWest" (Dummy), "Einwohner 18-65 Prozent", "Deutsche 18+ Familienstand verwitwet Prozent", "Einwohner unter 65 in SGB II 2017 Prozent", "Einwohner 65 und älter Prozent", "Deutsche 35-45 Prozent", "GRÜNE in %", "AfD in %".
We match 15 districts to our treated observations.
The post-matching average treatment effect is as follows:
The average treatment effect of door-to-door campaigning (htwk) is 1.34 percentage points. The result is not significant (p-value 0.38187 > 0.05; probably because of the small sample size). It is likely that door-to-door campaigning was targeted at Greens-affiliated households in particular and hence shows only a small effect on the vote share.
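The matching workflow described here can be sketched with MatchIt and cobalt as follows. This is not the original analysis code: the data are simulated, the covariates are reduced to three with hypothetical names ("share_65plus", "sgb2_pct", plus the "OstWest" dummy), and the MatchIt defaults for nearest-neighbour matching (1:1, without replacement, propensity score from a logistic regression) are assumed rather than confirmed by the text.

```r
library(MatchIt)
library(cobalt)

set.seed(2019)

# Simulated districts with three illustrative covariates
n <- 80
toy <- data.frame(
  OstWest      = rbinom(n, 1, 0.5),
  share_65plus = runif(n, 5, 40),
  sgb2_pct     = runif(n, 2, 35)
)

# Campaigners favour "promising" districts, so treatment is not random.
# Exactly 15 districts are treated to mimic a small treated group.
favourability <- -0.08 * toy$share_65plus - 0.05 * toy$sgb2_pct + rnorm(n)
toy$htwk <- as.integer(rank(-favourability) <= 15)

# Outcome depends on the covariates plus a small treatment effect
toy$gruene_pct <- 45 - 0.4 * toy$share_65plus - 0.3 * toy$sgb2_pct +
  1.5 * toy$htwk + rnorm(n, sd = 3)

# Nearest-neighbour matching on the estimated propensity score
m.out <- matchit(htwk ~ OstWest + share_65plus + sgb2_pct,
                 data = toy, method = "nearest")

# Covariate balance before and after matching
print(bal.tab(m.out, un = TRUE))

# Matched sample: each treated district plus its matched control
matched <- match.data(m.out)

# Post-matching comparison of the outcome
tt_matched <- t.test(gruene_pct ~ htwk, data = matched)
print(tt_matched)
```

With 15 treated districts and 1:1 matching, the test runs on only 30 observations, which is one reason a moderate effect can fail to reach significance.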
2.4 Dummy variable: Campaigning with Election booths ("Wahlkampfstände", wkst)
Some of the voting districts were targeted with election booths, where campaigners informed citizens about the political positions of the Greens. We examine the effect of having set up election booths in a district.
The scatter plot indicates a potential difference between districts with an election booth and those without. To quantify the potential effect, we calculate the NATE.
2.4.1 Naive Average Treatment Effect (NATE)
The Naive Average Treatment Effect for campaigning with election booths is around 3.0 percentage points. This means that the Greens scored 3.0 percentage points higher on average in districts where they campaigned with election booths. The difference is significant and hence calls for a more thorough analysis, e.g. by applying a matching strategy.
2.4.2 Matching ("Wahlkampfstände", wkst/Anzahl Stände)
We identify the following interesting variables:
- Einwohner 65 und älter Prozent
- Deutsche 35 - 45 Prozent
- Deutsche 45 - 60 Prozent
- Deutsche 18+ Familienstand verheiratet Anzahl
- Deutsche 18+ Familienstand verwitwet Prozent
- Deutsche 18+ Familienstand geschieden Prozent
- Einwohner unter 65 in SGB II 2017 Prozent
- GRÜNE in %
- AfD in %
We apply a Nearest Neighbor Matching based on the above-mentioned variables.
We run a t-test to check for robustness.
The average treatment effect of using election booths is 0.02352, expressed as a proportion. This means the districts with election booth campaigning scored approx. 2.4 percentage points higher than those without this type of campaigning. The result is not significant (p-value 0.14025 > 0.05; probably because of the small sample size).
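The small-sample explanation can be made concrete with a rough power calculation using base R's power.t.test. Only the effect size of 2.4 percentage points is taken from the text; the standard deviation of district vote shares and the number of districts per group are not reported for this comparison, so the values below are pure assumptions chosen for illustration.

```r
# Effect size from the text, in percentage points
effect_pp <- 2.4

# ASSUMPTIONS (not from the source): spread of vote shares and group size
assumed_sd  <- 6    # hypothetical standard deviation in percentage points
n_per_group <- 15   # hypothetical number of districts per group

# Power of a two-sample t-test at the assumed group size
power_now <- power.t.test(n = n_per_group, delta = effect_pp,
                          sd = assumed_sd, sig.level = 0.05)$power

# Districts per group needed to reach 80% power for the same effect
n_needed <- power.t.test(delta = effect_pp, sd = assumed_sd,
                         sig.level = 0.05, power = 0.8)$n

cat("power at assumed n per group:", round(power_now, 2), "\n")
cat("n per group for 80% power:   ", ceiling(n_needed), "\n")
```

Under assumptions of this kind, an effect of this size would need far more districts per group than a small matched sample provides, so a non-significant result does not by itself show that the effect is absent.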
3. Conclusion
A closer look at the Green Party's vote share in the 2019 European Parliament election reveals some interesting insights into demographic variables of the electorate, such as the difference between former East and former West districts. Naive approaches indicate that door-to-door campaigning has a strong effect on the vote share. Similar tendencies can be observed for election booth campaigning, but the effect is far smaller. When a matching strategy is applied, both effects shrink in size and become insignificant, probably because of the limited number of observations. The shrinkage is likely due to the targeted application of door-to-door and election booth campaigning in areas that are more sympathetic to the party's goals than others.
In general, both campaigning activities show a positive effect, but the findings suffer from the critically small data set and require additional analysis to become robust.