Ben Namovicz
On November 9, 2020, Trump campaign advisor Steve Cortes published an article titled The Statistical Case Against Biden's Win. Cortes later adapted this article into a series of Twitter videos, one of which was retweeted by President Trump. The article claims to use statistics to demonstrate the "intense improbability of the accuracy of the present Biden lead". [0]
According to Cortes, there were unusual results in specific swing states that favored Biden. Cortes thinks it would be highly unlikely to see results like these in a fair election. The central thesis revolves around the idea that the results in swing states were significantly different from the rest of the country, but Cortes only ever compares these swing states to one or two other non-swing states. I will compare the results in swing states to the entire rest of the country to see if they were really unusual after all.
In this notebook I will go through the entire process from data collection to data cleaning to data analysis. First I will collect data on the 2020 presidential election, past presidential elections, the 2020 senate election, and geographical information on US counties. Then I will tidy the data so that different datasets can be combined. Finally I will produce data visualizations to investigate the statistical claims Cortes makes.
import numpy as np
import pandas as pd
import geopandas as gp
import time
from selenium import webdriver
import hvplot.pandas
from bokeh.models import HoverTool
from holoviews import opts
base_url = 'https://www.politico.com/2020-election/results/'
driver_path = 'C:\\chromedriver_win32\\chromedriver.exe'
def president_url(state):
    return base_url + state.replace(' ', '-').lower()
def senate_url(state):
    #Arizona's 2020 senate race was a special election, listed under a different path
    if state == 'Arizona':
        return base_url + state.replace(' ', '-').lower() + '/senate-special/'
    return base_url + state.replace(' ', '-').lower() + '/senate'
def get_data(driver):
    time.sleep(1)
    elem = driver.find_element_by_xpath("//div[contains(@class, 'county-results')]")
    time.sleep(2)
    try:
        button = elem.find_element_by_tag_name('button')
        time.sleep(1)
        button.click()
        time.sleep(4)
    except:
        pass
    html_table = elem.find_element_by_tag_name('table')
    time.sleep(1)
    table = pd.read_html(html_table.get_attribute('outerHTML'))[0]
    return table
def get_DC_data(driver):
    time.sleep(1)
    elem = driver.find_element_by_xpath("//div[contains(@class, 'results-table')]")
    time.sleep(1)
    html_table = elem.find_element_by_tag_name('table')
    time.sleep(1)
    table = pd.read_html(html_table.get_attribute('outerHTML'))[0]
    return table
#Finds out which party the winning candidate is in for a Senate race, renames columns accordingly
def find_parties(driver, df):
    header = driver.find_element_by_tag_name('h1')
    winner = header.get_attribute('innerHTML')
    if '(D)' in winner:
        df.columns = ['County', 'democrat votes', 'democrat pct', 'republican votes', 'republican pct']
    else:
        df.columns = ['County', 'republican votes', 'republican pct', 'democrat votes', 'democrat pct']
    return df
pres_data_2020 = pd.DataFrame(columns=['County', 'Biden votes', 'Biden pct', 'Trump votes', 'Trump pct', 'State'])
states = ['Alabama', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut', 'Delaware', 'Florida', 'Georgia',
'Hawaii', 'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas', 'Kentucky', 'Louisiana', 'Maine', 'Maryland', 'Massachusetts',
'Michigan', 'Minnesota', 'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire', 'New Jersey',
'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon', 'Pennsylvania', 'Rhode Island',
'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Utah', 'Vermont', 'Virginia', 'Washington', 'West Virginia',
'Wisconsin', 'Wyoming']
driver = webdriver.Chrome(executable_path=driver_path)
time.sleep(5)
#Takes a few minutes to run
for state in states:
    try:
        time.sleep(1)
        driver.get(president_url(state))
        table = get_data(driver)
        table['State'] = state
        pres_data_2020 = pd.concat([pres_data_2020, table])
    except:
        print('No county data for ', state)
#Gets the DC data
try:
    time.sleep(1)
    driver.get(president_url('Washington DC'))
    DC_table = get_DC_data(driver)
    #DataFrame.append was removed in pandas 2.0, so the row is added with pd.concat
    DC_row = pd.DataFrame([{'County':'Washington', 'Biden votes':DC_table.iloc[0, 1],
                            'Biden pct':DC_table.iloc[0, 2], 'Trump votes':DC_table.iloc[1, 1],
                            'Trump pct':DC_table.iloc[1, 2], 'State':'District of Columbia'}])
    pres_data_2020 = pd.concat([pres_data_2020, DC_row], ignore_index=True)
except:
    print('No data for Washington DC')
driver.quit()
I found complete county level data for all presidential elections from 2000 to 2016 at the Harvard Dataverse. I have edited this file slightly:
pres_data_historical = pd.read_csv('countypres_2000-2016.csv')
senate_data_2020 = pd.DataFrame()
senate_states = ['Alabama', 'Arizona', 'Arkansas', 'Colorado', 'Delaware', 'Georgia', 'Idaho',
'Illinois', 'Iowa', 'Kansas', 'Kentucky', 'Maine', 'Massachusetts', 'Michigan', 'Minnesota',
'Mississippi', 'Montana', 'Nebraska', 'New Hampshire', 'New Jersey', 'New Mexico', 'North Carolina',
'Oklahoma', 'Oregon', 'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas',
'Virginia', 'West Virginia', 'Wyoming']
driver = webdriver.Chrome(executable_path=driver_path)
time.sleep(5)
#Takes a few minutes to run
for state in senate_states:
    try:
        time.sleep(1)
        driver.get(senate_url(state))
        table = get_data(driver)
        table = find_parties(driver, table)
        table['State'] = state
        senate_data_2020 = pd.concat([senate_data_2020, table])
    except:
        print('No senate data for ', state)
driver.quit()
I got the geographic boundaries for counties from census.gov. The Census Bureau also publishes population and citizen voting age population (CVAP) for every county in election years. At the time of writing, 2020 data is not yet available, so I used 2019 data in its place. For 2012 I edited the file to fix Oglala Lakota County, SD, as in the historical data. I also renamed the CVAP files, which were all originally called 'County.csv' or 'county.sas7bdat'.
county_geog = gp.read_file('cb_2018_us_county_500k')
cvap_2012 = pd.read_csv('County_CVAP_2012.csv')
cvap_2016 = pd.read_csv('County_CVAP_2016.csv')
cvap_2019 = pd.read_sas('County_CVAP_2019.sas7bdat', format = 'sas7bdat', encoding="latin-1")
First I define a few useful functions for data cleaning. Then I get the data ready for analysis. I will have to combine different datasets so I can compare data between them. Data from Politico doesn't have FIPS identifiers for counties, so I have to match them by name to counties in the historical data. Making sure county names matched exactly proved complicated. [1]
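As a rough illustration of why exact name matching matters, here is a minimal sketch that uses a left merge to list scraped counties with no partner in the historical table. The two small tables and the names `scraped`, `historical`, `matched`, and `unmatched` are made up for this example and are not part of the notebook's data.

```python
import pandas as pd

# Toy stand-ins for the scraped 2020 table and the historical table (assumed data).
scraped = pd.DataFrame({
    'state': ['Louisiana', 'Virginia', 'New Mexico'],
    'county': ['Acadia Parish', 'Fairfax City', 'Dona Ana County'],
})
historical = pd.DataFrame({
    'state': ['Louisiana', 'Virginia', 'New Mexico'],
    'county': ['Acadia Parish', 'Fairfax City', 'Doña Ana County'],
    'FIPS': [22001, 51600, 35013],
})

# indicator=True adds a '_merge' column that records where each row came from.
matched = scraped.merge(historical, on=['state', 'county'], how='left', indicator=True)

# Rows marked 'left_only' had no exact name match and would end up without a FIPS.
unmatched = matched.loc[matched['_merge'] == 'left_only', ['state', 'county']]
print(unmatched)
```

A single missing tilde is enough to leave a county unmatched, which is the kind of case the replacement table below is meant to handle.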
def fix_pct(pct):
    return float(pct.replace('%', ''))
#Determines the total votes based on the vote totals and percents for each party
def sum_votes(row):
    two_party_votes = row['democrat votes'] + row['republican votes']
    two_party_pct = row['democrat pct'] + row['republican pct']
    #Percents given are not exact, so this number will not be exact
    return int(100*two_party_votes/two_party_pct)
virginia_cities = ['Alexandria', 'Bristol', 'Buena Vista', 'Charlottesville', 'Chesapeake', 'Colonial Heights', 'Covington',
'Danville', 'Emporia', 'Fairfax', 'Falls Church', 'Franklin', 'Fredericksburg', 'Galax', 'Hampton',
'Harrisonburg', 'Hopewell', 'Lexington', 'Lynchburg', 'Manassas', 'Manassas Park', 'Martinsville',
'Newport News', 'Norfolk', 'Norton', 'Petersburg', 'Poquoson', 'Portsmouth', 'Radford', 'Richmond',
'Roanoke', 'Salem', 'Staunton', 'Suffolk', 'Virginia Beach', 'Waynesboro', 'Williamsburg', 'Winchester']
#List of independent cities that share a name with a county in Virginia
virginia_double = ['Bedford', 'Fairfax', 'Franklin', 'Richmond', 'Roanoke']
county_replacements = {
'Desoto': 'DeSoto',
'LaSalle': 'La Salle',
'Lac Qui Parle': 'Lac qui Parle',
'Dona Ana': 'Doña Ana',
'Dewitt': 'DeWitt',
'Saint Louis': 'St. Louis',
'District of Columbia': 'Washington'
}
def is_city(county, state):
    return ((state == 'Maryland' and county.lower() == 'baltimore city')
            or (state == 'Missouri' and county.lower() == 'st. louis city')
            or (state == 'Nevada' and county.lower() == 'carson city')
            or (state == 'Virginia' and county.replace(' city', '') in virginia_double and ' city' in county)
            or (state == 'Virginia' and county.replace(' city', '') in virginia_cities and
                county.replace(' city', '') not in virginia_double))
def find_county_type(county, state):
    if state == 'DC' or state == 'District of Columbia':
        return 'District'
    elif state == 'Louisiana':
        return 'Parish'
    elif is_city(county, state):
        return 'City'
    else:
        return 'County'
#Removes ' county' from the end of county names
def remove_county(county):
    #Preserve the names of 'Charles City County' and 'James City County'
    if 'Charles City' in county or 'James City' in county:
        return county.replace(' County', '')
    else:
        county = county.replace(' County', '')
        county = county.replace(' Parish', '')
        county = county.replace(' city', '')
        county = county.replace(' City', '')
        return county
def fix_county(county, state):
    county_type = find_county_type(county, state)
    county_name = remove_county(county)
    if county_name in county_replacements:
        county_name = county_replacements[county_name]
    if county_type == 'District':
        return county_name
    return county_name + ' ' + county_type
#Matches a row of the 2020 dataset to the FIPS of the same county in the historical dataset
def find_FIPS(county, state):
    lines = pres_data_historical.loc[pres_data_historical['state'] == state]
    lines = lines.loc[lines['county'] == county]
    if lines.empty:
        print("cannot find", county, state)
        #np.nan is used because the np.NaN alias was removed in NumPy 2.0
        return np.nan
    line = lines.iloc[0]
    return int(line['FIPS'])
#Turns CVAP GEOID into FIPS by removing '05000US' from the front
def fix_FIPS(GEOID):
    return int(GEOID[7:])
pres_data_2020 = pres_data_2020.rename(columns={"County": "county", "State": "state",
'Biden votes': 'democrat votes', 'Trump votes': 'republican votes',
'Biden pct': 'democrat pct', 'Trump pct': 'republican pct'})
pres_data_2020['year'] = 2020
pres_data_2020['democrat pct'] = pres_data_2020['democrat pct'].apply(fix_pct)
pres_data_2020['republican pct'] = pres_data_2020['republican pct'].apply(fix_pct)
pres_data_2020['total votes'] = pres_data_2020.apply(sum_votes, axis=1)
pres_data_2020['county'] = pres_data_2020.apply(lambda x: fix_county(x.county, x.state), axis=1)
pres_data_2020['democrat votes'] = pres_data_2020['democrat votes'].apply(int)
pres_data_2020['republican votes'] = pres_data_2020['republican votes'].apply(int)
pres_data_historical = pres_data_historical.query("party == 'republican' or party == 'democrat'")
pres_data_historical = pres_data_historical.drop(columns=['state_po', 'office', 'candidate', 'version'])
#Combines republican votes and democrat votes into one row
index = ['year', 'state', 'county', 'totalvotes', 'FIPS']
pres_data_historical = pd.pivot_table(pres_data_historical, index=index, columns='party', values='candidatevotes')
pres_data_historical = pres_data_historical.reset_index()
pres_data_historical.columns.names = [''] #This is needed to finish resetting the index
pres_data_historical = pres_data_historical.rename(columns={'totalvotes': 'total votes',
'democrat': 'democrat votes', 'republican': 'republican votes'})
pres_data_historical['democrat pct'] = 100*pres_data_historical['democrat votes']/pres_data_historical['total votes']
pres_data_historical['republican pct'] = 100*pres_data_historical['republican votes']/pres_data_historical['total votes']
pres_data_historical['county'] = pres_data_historical.apply(lambda x: fix_county(x.county, x.state), axis=1)
senate_data_2020 = senate_data_2020.rename(columns={"County": "county", "State": "state"})
senate_data_2020['democrat pct'] = senate_data_2020['democrat pct'].apply(fix_pct)
senate_data_2020['republican pct'] = senate_data_2020['republican pct'].apply(fix_pct)
senate_data_2020['total votes'] = senate_data_2020.apply(sum_votes, axis=1)
senate_data_2020['county'] = senate_data_2020.apply(lambda x: fix_county(x.county, x.state), axis=1)
senate_data_2020['democrat votes'] = senate_data_2020['democrat votes'].apply(int)
senate_data_2020['republican votes'] = senate_data_2020['republican votes'].apply(int)
senate_data_2020['FIPS'] = senate_data_2020.apply(lambda x: find_FIPS(x.county, x.state), axis=1)
county_geog['STATEFP'] = county_geog['STATEFP'].apply(int)
#Moves Hawaii closer to the continental US on the map
#.copy() avoids modifying a slice of county_geog (the cause of a SettingWithCopyWarning)
move_HI = county_geog.loc[county_geog['STATEFP'] == 15].copy()
move_HI['geometry'] = move_HI['geometry'].translate(45, 5)
county_geog = county_geog.loc[county_geog['STATEFP'] != 15]
county_geog = pd.concat([county_geog, move_HI])
#Removes Alaska and US territories
county_geog = county_geog.loc[county_geog['STATEFP'] != 2]
county_geog = county_geog.loc[county_geog['STATEFP'] < 60]
county_geog = county_geog.drop(columns=['STATEFP', 'COUNTYFP', 'NAME', 'LSAD', 'AFFGEOID', 'COUNTYNS', 'AWATER'])
county_geog = county_geog.rename(columns={'GEOID':'FIPS', 'ALAND':'area'})
county_geog['FIPS'] = county_geog['FIPS'].apply(int)
#Converts area from square meters to square miles
county_geog['area'] = 0.00000038610*county_geog['area']
cvap_2019.columns = cvap_2012.columns
cvap_2012['year'] = 2012
cvap_2016['year'] = 2016
cvap_2019['year'] = 2019
cvap = pd.concat([cvap_2012, cvap_2016, cvap_2019])
cvap['GEOID'] = cvap['GEOID'].apply(fix_FIPS)
cvap = cvap.loc[cvap['LNNUMBER'] == 1] #Selects overall population
cvap = cvap.drop(columns=['GEONAME', 'LNTITLE', 'LNNUMBER', 'TOT_MOE', 'ADU_EST', 'ADU_MOE', 'CIT_EST', 'CIT_MOE', 'CVAP_MOE'])
cvap = cvap.rename(columns={'GEOID': 'FIPS', 'TOT_EST': 'population', 'CVAP_EST': 'CVAP'})
#Combines CVAP data from 2012, 2016, and 2019 into one row
cvap = pd.pivot_table(cvap, index='FIPS', columns='year', values=['population', 'CVAP'])
cvap = cvap.reset_index()
#Fixes the names of columns
cvap.columns = cvap.columns.droplevel(1) + ' ' + cvap.columns.droplevel(0).map(lambda x: str(x))
cvap.columns = cvap.columns.map(lambda x: x.strip())
county_geog = county_geog.merge(cvap, on='FIPS')
county_geog = county_geog.dropna()
state_geog = county_geog.merge(pres_data_historical.loc[pres_data_historical['year'] == 2016], on='FIPS')
state_geog = state_geog.dissolve(by='state', aggfunc='sum')
state_geog = state_geog.reset_index()
state_geog = state_geog.loc[:, ['state', 'geometry', 'population 2012', 'population 2016', 'population 2019',
'CVAP 2012', 'CVAP 2016', 'CVAP 2019', 'area']]
#Reduces the resolution of the county and state borders so maps load faster
county_geog['geometry'] = county_geog['geometry'].simplify(.005)
state_geog['geometry'] = state_geog['geometry'].simplify(.005)
2020 presidential data combined with historical presidential data
pres_data_2020['FIPS'] = pres_data_2020.apply(lambda x: find_FIPS(x.county, x.state), axis=1)
pres_counties = pd.concat([pres_data_historical, pres_data_2020])
pres_counties['margin'] = pres_counties['republican pct'] - pres_counties['democrat pct']
pres_counties['margin direction'] = pres_counties['margin'].apply(lambda x: 'R' if x > 0 else 'D')
pres_counties['absolute margin'] = pres_counties['margin'].apply(abs)
pres_states = pres_counties.groupby(['state', 'year']).sum()
pres_states = pres_states.reset_index()
pres_states.columns.names = ['']
pres_states = pres_states.dropna()
pres_states['democrat pct'] = 100*pres_states['democrat votes']/pres_states['total votes']
pres_states['republican pct'] = 100*pres_states['republican votes']/pres_states['total votes']
pres_states['margin'] = pres_states['republican pct'] - pres_states['democrat pct']
pres_states['margin direction'] = pres_states['margin'].apply(lambda x: 'R' if x > 0 else 'D')
pres_states['absolute margin'] = pres_states['margin'].apply(abs)
2020 presidential and senate election results combined by county and state
pres_data_2020['race'] = 'pres'
senate_data_2020['race'] = 'sen'
counties_2020 = pd.concat([pres_data_2020, senate_data_2020])
counties_2020 = counties_2020.drop(columns=['year'])
counties_2020['margin'] = counties_2020['republican pct'] - counties_2020['democrat pct']
counties_2020['margin direction'] = counties_2020['margin'].apply(lambda x: 'R' if x > 0 else 'D')
counties_2020['absolute margin'] = counties_2020['margin'].apply(abs)
states_2020 = counties_2020.groupby(['state', 'race']).sum()
states_2020 = states_2020.reset_index()
states_2020.columns.names = ['']
states_2020 = states_2020.dropna()
states_2020['democrat pct'] = 100*states_2020['democrat votes']/states_2020['total votes']
states_2020['republican pct'] = 100*states_2020['republican votes']/states_2020['total votes']
states_2020['margin'] = states_2020['republican pct'] - states_2020['democrat pct']
states_2020['margin direction'] = states_2020['margin'].apply(lambda x: 'R' if x > 0 else 'D')
states_2020['absolute margin'] = states_2020['margin'].apply(abs)
national_2020 = states_2020.loc[states_2020['state'].isin(senate_states)]
national_2020 = national_2020.groupby('race').sum().reset_index()
national_2020['turnout'] = national_2020['total votes']/1000000
national_2020['margin'] = national_2020['republican votes'] - national_2020['democrat votes']
national_2020['margin'] = 100*national_2020['margin']/national_2020['total votes']
national_2020['margin direction'] = national_2020['margin'].apply(lambda x: 'R' if x > 0 else 'D')
national_2020['absolute margin'] = national_2020['margin'].apply(abs)
Cortes provides four pieces of evidence to support his thesis of election improbability. I will investigate each of these claims separately, and evaluate them on their merits. They are:
The first claim Cortes makes is that Wisconsin saw implausibly high turnout. Wisconsin had a turnout rate of 89% of registered voters. This number can be misleading: because Wisconsin has same-day voter registration, it usually calculates turnout as a share of eligible voters. Turnout as a share of eligible voters was 72.3%, only a little higher than the 2016 turnout of 67.3%. Cortes misleadingly compares the registered-voter turnout, exaggerated as "over 90%", to Australia's eligible-voter turnout of 92%.
Cortes calls the 84% turnout in Milwaukee, WI "suspect" compared with only 51% turnout in the similar city of Cleveland, OH. It appears this 84% was calculated as a share of eligible voters, unlike his figure for Wisconsin overall. The 51% turnout in Cleveland was calculated as a share of registered voters, because Ohio does not have same-day voter registration. The comparison is misleading in a second way. Both Milwaukee and Cleveland are cities within larger counties, and the surrounding suburbs in Milwaukee County and Cuyahoga County have much higher turnout than the cities themselves. The 84% turnout Cortes cites is for all of Milwaukee County, while the 51% turnout in Cleveland is for the city alone. Cuyahoga County had an overall turnout of 71%.
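To make the point about denominators concrete, here is a small sketch with made-up numbers. The helper `turnout_pct` and the figures for `eligible`, `registered`, and `votes` are assumptions chosen for illustration, not data from Wisconsin or Ohio.

```python
def turnout_pct(votes, base):
    """Turnout as a percentage of the chosen base population."""
    return 100 * votes / base

# Hypothetical jurisdiction: the same ballots, measured against two different bases.
eligible = 1000    # everyone allowed to vote
registered = 800   # the subset who are registered
votes = 720        # ballots cast

of_registered = turnout_pct(votes, registered)
of_eligible = turnout_pct(votes, eligible)

print(f'Share of registered voters: {of_registered:.1f}%')  # 90.0%
print(f'Share of eligible voters:   {of_eligible:.1f}%')    # 72.0%
```

The same election looks 18 points more dramatic when measured against registered voters, so two turnout figures are only comparable if they share a denominator.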
Any comparison between state turnouts is going to be flawed. Different states have different rules for registration, different rules for voting, and different methods to calculate turnout. I will compare states and counties using two more meaningful measures:
Change in turnout from last election: This method is useful for determining whether anywhere had an unusual spike in voting. States generally maintain voting rules year to year,[2] and this measure controls for that. A drawback is that this fails to account for changing population size.
Turnout as a percent of voting age citizens: This allows for a consistent measure of turnout between states, and as population size changes. This method doesn't account for different registration and voting rules between states.
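The two measures above can be sketched on a toy table before applying them to the real data. The counties and numbers here are invented; only the column names follow the ones used later in this notebook.

```python
import pandas as pd

# Invented counties, used only to show how the two measures are computed.
toy = pd.DataFrame({
    'county': ['A County', 'B County'],
    'total votes 2016': [40000, 10000],
    'total votes 2020': [44000, 12500],
    'CVAP 2016': [60000, 20000],
    'CVAP 2019': [62000, 20500],
})

# Measure 1: change in turnout from the last election, as a percent of 2016 votes.
toy['pct increase'] = 100 * (toy['total votes 2020'] - toy['total votes 2016']) / toy['total votes 2016']

# Measure 2: turnout as a percent of citizen voting age population.
# 2019 CVAP stands in for 2020, as it does in the rest of the notebook.
toy['pct turnout 2016'] = 100 * toy['total votes 2016'] / toy['CVAP 2016']
toy['pct turnout 2020'] = 100 * toy['total votes 2020'] / toy['CVAP 2019']

print(toy[['county', 'pct increase', 'pct turnout 2016', 'pct turnout 2020']].round(1))
```

In this toy case B County has the larger jump in votes but still the lower turnout rate, which is why I look at both measures.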
#Gets turnout data from the overall dataset
county_turnout = pres_counties.loc[pres_counties['year'] > 2015]
county_turnout = pd.pivot_table(county_turnout, index=['state', 'county', 'FIPS'], columns='year', values='total votes')
county_turnout = county_turnout.reset_index()
county_turnout = county_turnout.rename(columns = {2016:'total votes 2016', 2020:'total votes 2020'})
county_turnout = county_geog.merge(county_turnout, on='FIPS')
#Calculates increase in turnout
county_turnout['total increase'] = county_turnout['total votes 2020'] - county_turnout['total votes 2016']
county_turnout['pct increase'] = 100*county_turnout['total increase']/county_turnout['total votes 2016']
#Calculates turnout as a percent of citizen voting age population
county_turnout['pct turnout 2016'] = 100*county_turnout['total votes 2016']/county_turnout['CVAP 2016']
county_turnout['pct turnout 2020'] = 100*county_turnout['total votes 2020']/county_turnout['CVAP 2019']
#Gets state data
state_turnout = pres_states.loc[pres_states['year'] > 2015]
state_turnout = pd.pivot_table(state_turnout, index=['state'], columns='year', values='total votes')
state_turnout = state_turnout.reset_index()
state_turnout = state_turnout.rename(columns = {2016:'total votes 2016', 2020:'total votes 2020'})
state_turnout = state_geog.merge(state_turnout, on='state')
#Calculates increase in turnout
state_turnout['total increase'] = state_turnout['total votes 2020'] - state_turnout['total votes 2016']
state_turnout['pct increase'] = 100*state_turnout['total increase']/state_turnout['total votes 2016']
#Calculates turnout as a percent of citizen voting age population
state_turnout['pct turnout 2016'] = 100*state_turnout['total votes 2016']/state_turnout['CVAP 2016']
state_turnout['pct turnout 2020'] = 100*state_turnout['total votes 2020']/state_turnout['CVAP 2019']
#Calculates national turnout
national_turnout = pres_counties.groupby('year').sum().reset_index()
#calculate turnout in millions of votes
national_turnout['turnout'] = national_turnout['total votes']/1000000
#Sets default options for maps
opts.defaults(
opts.Polygons(xaxis='bare', yaxis='bare', line_width=0.5, hover_line_color='darkgoldenrod'))
#Creates state borders map to emphasize state borders in national maps of counties
state_borders = state_geog.loc[:, ['geometry']]
borders = state_borders.hvplot(c='clear', geo=True, line_width=.75, hover_line_color=None)
To start I plot national turnout over time for context
hover = HoverTool(tooltips=[('', 'Turnout: @turnout million')])
national_turnout.hvplot.bar(x='year', y='turnout', xlabel='Year', ylabel='Turnout (millions)',
title= 'National Turnout from 2000 to 2020', hover_cols='turnout', tools=[hover])
National turnout rose from 136 million in 2016 to 157 million in 2020, a 15% increase. We can see how this varies between states.
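The 15% figure follows directly from the two rounded totals quoted above, as this short check shows. The variable names are mine.

```python
# Rounded national totals in millions of votes, as quoted in the text above.
votes_2016 = 136
votes_2020 = 157

pct_increase = 100 * (votes_2020 - votes_2016) / votes_2016
print(f'{pct_increase:.1f}% increase')  # 15.4% increase
```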
hover = HoverTool(tooltips=[('', '@state'),
('', 'Increase in Turnout: @pct_increase%')])
hover_cols=['state', 'pct increase']
state_turnout.hvplot(c='pct increase', geo=True, title='Increase in Turnout from 2016 to 2020 by State',
tools=[hover], hover_cols=hover_cols, cmap='greens', clim=(0, 32))
As you can see, Wisconsin hardly stands out. Its 10% increase in turnout is actually less than the national average. Ohio appears to be the more unusual case, with one of the lower increases in turnout. Cortes compared two states with relatively small increases in turnout in order to argue that turnout in one of them grew improbably quickly.
Let's move down to the county level to see if the counties Cortes highlighted in Wisconsin and Ohio were unusual.
hover = HoverTool(tooltips=[('', '@county, @state'),
('', 'Increase in Turnout: @pct_increase%')])
hover_cols=['state', 'county', 'pct increase']
wisconsin_turnout = county_turnout.loc[county_turnout['state'] == 'Wisconsin']
wisconsin_turnout.hvplot(c='pct increase', geo=True, title='Wisconsin Increase in Turnout from 2016 to 2020',
tools=[hover], hover_cols=hover_cols, cmap='PiYG', clim=(-25, 25))
ohio_turnout = county_turnout.loc[county_turnout['state'] == 'Ohio']
ohio_turnout.hvplot(c='pct increase', geo=True, title='Ohio Increase in Turnout from 2016 to 2020',
tools=[hover], hover_cols=hover_cols, cmap='PiYG', clim=(-25, 25))
Milwaukee County, which Cortes calls "suspect", saw a modest increase of 4% since 2016. This is higher than Cuyahoga County's 3%, but lower than the Wisconsin state average of 10% and lower than any other county in the state. Between the two states, three counties stand out for large increases in turnout: Menominee County in Wisconsin is a small, mostly Native American county of just 4,556 people. Delaware and Union Counties in Ohio were both won by Trump. No other county in either state saw turnout rise by more than 20%.
We can compare these with the rest of the counties in the US to see that there is wide variation in how much turnout changed between regions. Cortes' assertion that turnout was abnormal "only in the key swing states that Biden allegedly won" doesn't hold water.
hover = HoverTool(tooltips=[('', '@county, @state'),
('', 'Increase in Turnout: @pct_increase%')])
hover_cols=['state', 'county', 'pct increase']
county_turnout.hvplot(c='pct increase', geo=True, title='Increase in Turnout from 2016 to 2020 by County',
tools=[hover], hover_cols=hover_cols, cmap='PiYG', clim=(-25, 25)) * borders
Now let's turn to the other measure: turnout as a percent of voting age citizens.
hover = HoverTool(tooltips=[('', '@state'),
('', '2016 Turnout: @pct_turnout_2016%'),
('', '2020 Turnout: @pct_turnout_2020%')])
hover_cols=['state', 'pct turnout 2016', 'pct turnout 2020']
state_turnout.hvplot(c='pct turnout 2020', geo=True, title='2020 Voting Age Citizen Turnout by State',
hover_cols=hover_cols, tools=[hover], cmap='greens', clim=(45, 85))
hover = HoverTool(tooltips=[('', '@county, @state'),
('', '2016 Turnout: @pct_turnout_2016%'),
('', '2020 Turnout: @pct_turnout_2020%')])
hover_cols=['state', 'county', 'pct turnout 2016', 'pct turnout 2020']
county_turnout.hvplot(c='pct turnout 2020', geo=True, title='2020 Voting Age Citizen Turnout by County',
tools=[hover], hover_cols=hover_cols, cmap='greens', clim=(45, 85)) * borders
Contrary to Cortes' claim, Wisconsin had a normal turnout. In 2020 turnout rose around the country, and Wisconsin saw a fairly typical increase. Milwaukee County usually has slightly lower turnout than Wisconsin overall, and it saw its turnout increase the least of any Wisconsin county. The claim of improbable turnout relies on misleading statistics and cherry-picked comparisons.
The main claim in this section is that Biden's improvements over Obama in certain counties are unrealistic. This isn't really a statistical claim: Cortes simply doesn't think that a "doddering and lazy" Biden could do better than Obama's "rockstar appeal". He once again says that these gains came in "just the right places". This time the "just right" places are suburban counties surrounding Philadelphia. Once again I will compare these counties to the rest of the country to see if they really stand out.
#Gets margin data from the overall dataset
county_margin = pres_counties.query("year == 2012 or year == 2020")
index = ['state', 'county', 'FIPS']
values = ['margin', 'margin direction', 'absolute margin']
county_margin = pd.pivot_table(county_margin, index=index, columns='year', values=values, aggfunc=lambda x:x)
county_margin = county_margin.reset_index()
county_margin.columns = county_margin.columns.droplevel(1) + ' ' + county_margin.columns.droplevel(0).map(lambda x: str(x))
county_margin.columns = county_margin.columns.map(lambda x: x.strip())
county_margin = county_geog.merge(county_margin, on='FIPS')
#Calculates change in margin from 2012 to 2020
county_margin['margin shift'] = county_margin['margin 2020'] - county_margin['margin 2012']
county_margin['shift direction'] = county_margin['margin shift'].apply(lambda x: 'R' if x > 0 else 'D')
county_margin['absolute shift'] = county_margin['margin shift'].apply(abs)
#Gets state data
state_margin = pres_states.query("year == 2012 or year == 2020")
state_margin = pd.pivot_table(state_margin, index=['state'], columns='year', values=values, aggfunc=lambda x:x)
state_margin = state_margin.reset_index()
state_margin.columns = state_margin.columns.droplevel(1) + ' ' + state_margin.columns.droplevel(0).map(str)
state_margin.columns = state_margin.columns.map(lambda x: x.strip())
state_margin = state_geog.merge(state_margin, on='state')
#Calculates change in margin from 2012 to 2020
state_margin['margin shift'] = state_margin['margin 2020'] - state_margin['margin 2012']
state_margin['shift direction'] = state_margin['margin shift'].apply(lambda x: 'R' if x > 0 else 'D')
state_margin['absolute shift'] = state_margin['margin shift'].apply(abs)
#Calculates national popular vote margin
national_margin = pres_counties.groupby('year').sum().reset_index()
national_margin['margin'] = national_margin['republican votes'] - national_margin['democrat votes']
national_margin['margin'] = 100*national_margin['margin']/national_margin['total votes']
national_margin['margin direction'] = national_margin['margin'].apply(lambda x: 'R' if x > 0 else 'D')
national_margin['absolute margin'] = national_margin['margin'].apply(abs)
First we see the context of popular vote margin over time.
hover = HoverTool(tooltips=[('', ' Popular Vote Margin: +@absolute_margin% @margin_direction')])
hover_cols= ['absolute margin', 'margin direction']
national_margin['color'] = national_margin['margin direction'].apply(lambda x: 'red' if x=='R' else 'blue')
national_margin.hvplot.bar(x='year', y='absolute margin', xlabel='Year', ylabel='Popular Vote Margin (%)', c='color',
title= 'National Popular Vote Margin from 2000 to 2020', tools=[hover], hover_cols=hover_cols)
To his credit, Cortes chose a good comparison. Biden's 2020 margin of 4.5% is closer to Obama's 2012 margin of 3.9% than any other election this century. These elections were very similar on the national level, so let's see how they compare in Pennsylvania.
hover = HoverTool(tooltips=[('', '@county, @state'),
('', '2012 Margin: +@absolute_margin_2012% @margin_direction_2012'),
('', '2020 Margin: +@absolute_margin_2020% @margin_direction_2020'),
('', 'Shift: +@absolute_shift% @shift_direction')])
hover_cols=['state', 'county', 'absolute margin 2012', 'margin direction 2012',
'absolute margin 2020', 'margin direction 2020', 'absolute shift', 'shift direction']
pennsylvania_margin = county_margin.loc[county_margin['state'] == 'Pennsylvania']
pennsylvania_margin.hvplot(c='margin shift', geo=True, title='Pennsylvania Change in Margin from 2012 to 2020',
tools=[hover], hover_cols = hover_cols, cmap='bwr', clim=(-40, 40))
This is an interesting map. Most of the counties in Pennsylvania have become much more Republican. Some have even shifted red by over 40 percentage points. Democrats have made gains in the counties surrounding Philadelphia, Pittsburgh, Harrisburg, and Penn State University. The main takeaway of this map is that Republicans are improving in rural areas while Democrats are improving in suburban areas. The only county on this map that includes city but not suburb is Philadelphia County, which moved towards Republicans. To see if this is unusual like Cortes claims, we can compare Pennsylvania to neighboring states.
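For readers following the sign convention in these maps, here is a minimal sketch of how a margin and a shift are computed and labelled. The county percentages are invented, and the helpers `margin` and `describe` are illustrative names rather than functions from this notebook.

```python
def margin(republican_pct, democrat_pct):
    """Margin in percentage points; positive means the Republican led."""
    return republican_pct - democrat_pct

def describe(value):
    """Formats a signed margin or shift the way the map tooltips do."""
    direction = 'R' if value > 0 else 'D'
    return f'+{abs(value):.1f} {direction}'

# Invented county: the Democrat won both times, by more in 2020.
margin_2012 = margin(45.0, 54.0)
margin_2020 = margin(40.0, 59.0)

# A negative shift means the county moved toward the Democrat.
shift = margin_2020 - margin_2012

print(describe(margin_2012), describe(margin_2020), describe(shift))
# +9.0 D +19.0 D +10.0 D
```

Blue on the 'bwr' colormap therefore corresponds to a negative shift and red to a positive one.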
Pennsylvania_neighbors = ['Pennsylvania', 'Ohio', 'New York', 'New Jersey', 'Delaware', 'Maryland', 'West Virginia']
neighbor_margin = county_margin.loc[county_margin['state'].isin(Pennsylvania_neighbors)]
neighbor_margin.hvplot(c='margin shift', geo=True, title='Change in Margin from 2012 to 2020 Surrounding Pennsylvania',
tools=[hover], hover_cols=hover_cols, cmap='bwr', clim=(-40, 40))
In the states bordering Pennsylvania the same trends hold. Rural counties are shifting Republican, some dramatically so. Suburban counties are shifting Democratic. We can take another step back and compare this to the whole country.
hover = HoverTool(tooltips=[('', '@county, @state'),
('', '2012 Margin: +@absolute_margin_2012% @margin_direction_2012'),
('', '2020 Margin: +@absolute_margin_2020% @margin_direction_2020'),
('', 'Shift: +@absolute_shift% @shift_direction')])
hover_cols=['state', 'county', 'absolute margin 2012', 'margin direction 2012']
#Note: the 2020 margin and shift columns are left out of hover_cols here, so those tooltip rows may not populate on the national map
county_margin.hvplot(c='margin shift', geo=True, title='Change in Margin from 2012 to 2020',
tools=[hover], hover_cols=hover_cols, cmap='bwr', clim=(-40, 40))
The pattern holds again at the national level, and there appears to be a regional trend where the midwest is getting redder and the southwest is getting bluer.
hover = HoverTool(tooltips=[('', '@state'),
('', '2012 Margin: +@absolute_margin_2012% @margin_direction_2012'),
('', '2020 Margin: +@absolute_margin_2020% @margin_direction_2020'),
('', 'Shift: +@absolute_shift% @shift_direction')])
hover_cols=['state', 'absolute margin 2012', 'margin direction 2012',
'absolute margin 2020', 'margin direction 2020', 'absolute shift', 'shift direction']
state_margin.hvplot(c='margin shift', geo=True, title='Change in Margin from 2012 to 2020',
tools=[hover], hover_cols=hover_cols, cmap='bwr', clim=(-30, 30))
The regional trend is once again visible on a state level map. Utah really stands out here with a 27% swing in just 8 years. Perhaps this is because Mitt Romney was the republican candidate in 2012, and is now a senator from Utah.
I decided to look a bit closer into the hypothesis that shift is related to whether a county is rural, suburban, or urban. I calculated population density of each county as a measure of urbanization. I plotted population density against change in margin from 2012 to 2020. There is a lot of noise, but a trend is visible. Counties with a population density of around 10 people per square mile swung more red on average than counties with a population density of around 100 people per square mile. There aren't enough counties with very high population densities to make out a clear trend.
county_margin['population density'] = county_margin['population 2019']/county_margin['area']
hover = HoverTool(tooltips=[('', '@county, @state'),
('', 'Shift: +@absolute_shift% @shift_direction')])
hover_cols=['state', 'county', 'absolute shift', 'shift direction']
graph_data = county_margin.loc[:, ['population density', 'state', 'county',
'margin shift', 'absolute shift', 'shift direction']]
graph_data.hvplot.scatter(x='population density', y='margin shift', xlabel='Population Density (people per square mile, log scale)',
ylabel='Change in Margin (%)', logx=True, s=10, tools=[hover], hover_cols=hover_cols,
title='Population Density vs Change in Presidential Margin from 2012 to 2020')
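One way to cut through the noise in a scatter plot like this is to group counties by the order of magnitude of their population density and compare the median shift in each group. The sketch below does this on a small made-up table. The column names match the ones used above, but the numbers are illustrative placeholders rather than election results, and `median_shift_by_density` is a hypothetical helper, not part of the analysis in this notebook. It assumes positive shift values mean a move toward republicans.

```python
import numpy as np
import pandas as pd

def median_shift_by_density(df, density_col='population density', shift_col='margin shift'):
    """Median shift per order-of-magnitude population density bin (hypothetical helper)."""
    # floor(log10(density)) puts densities of 1-9.99 in bin 0, 10-99.99 in bin 1, and so on
    magnitude = np.floor(np.log10(df[density_col])).astype(int).rename('density magnitude')
    return df.groupby(magnitude)[shift_col].median()

# Made-up example rows (NOT real county data)
toy = pd.DataFrame({
    'population density': [12.0, 35.0, 80.0, 150.0, 400.0, 900.0],
    'margin shift': [15.0, 10.0, 20.0, -2.0, 1.0, -5.0],
})
binned = median_shift_by_density(toy)
print(binned)
```

The median is used instead of the mean so that a handful of extreme counties does not dominate a bin.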
Cortes claims that over 450,000 ballots nationwide were cast for Biden with the rest of the ballot left blank. He says there were 95,801 Biden only ballots in Georgia compared to just 818 Trump only ballots. These numbers are calculated by taking Biden's or Trump's vote totals and subtracting that state's Democratic or Republican senate candidates' total. This exact claim was addressed in an Associated Press article. There is simply no way of knowing for certain whether this difference is the result of more Biden ballots being left blank or split ticket voters. It is not unusual for senate races to get fewer votes than presidential races.
While there is no way to know exactly how many 'Biden only ballots' were cast, we can get a rough estimate of how many ballots were cast for Biden (or Trump) that left the senate race blank by looking at two measures: presidential overperformance of senate candidates and presidential turnout compared to senate turnout.
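To see why the subtraction Cortes uses cannot separate blank senate votes from split tickets, consider the minimal sketch below. All vote counts are hypothetical round numbers chosen for illustration, and `president_only_estimate` is an assumed name for the calculation described above, not a function from this notebook.

```python
def president_only_estimate(pres_votes, senate_votes):
    """Presidential candidate's votes minus the same party's senate candidate's votes."""
    return pres_votes - senate_votes

# Scenario A (hypothetical): 1,000 voters pick the democrat for president and skip the senate race
scenario_a = {'pres_dem': 51_000, 'sen_dem': 50_000, 'sen_rep': 49_000}
# Scenario B (hypothetical): the same 1,000 voters vote for the republican senate candidate instead
scenario_b = {'pres_dem': 51_000, 'sen_dem': 50_000, 'sen_rep': 50_000}

gap_a = president_only_estimate(scenario_a['pres_dem'], scenario_a['sen_dem'])
gap_b = president_only_estimate(scenario_b['pres_dem'], scenario_b['sen_dem'])

# Total senate turnout is the only thing that differs between the two scenarios
senate_total_a = scenario_a['sen_dem'] + scenario_a['sen_rep']
senate_total_b = scenario_b['sen_dem'] + scenario_b['sen_rep']
print(gap_a, gap_b, senate_total_a, senate_total_b)
```

Both scenarios give the same gap, which is why the gap alone says nothing about blank ballots. Comparing senate turnout to presidential turnout adds the second piece of information.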
#Gets presidential and senate data from the overall dataset
index = ['state', 'county', 'FIPS']
values = ['total votes', 'margin', 'margin direction', 'absolute margin']
#Each (county, race) pair has one row, so 'first' just carries the value through
counties_2020 = pd.pivot_table(counties_2020, index=index, columns='race', values=values, aggfunc='first')
counties_2020 = counties_2020.reset_index()
counties_2020.columns = counties_2020.columns.droplevel(1) + ' ' + counties_2020.columns.droplevel(0).map(lambda x: str(x))
counties_2020.columns = counties_2020.columns.map(lambda x: x.strip())
#Calculates difference in turnout
counties_2020['turnout diff'] = counties_2020['total votes pres'] - counties_2020['total votes sen']
counties_2020['turnout pct'] = 100*counties_2020['turnout diff']/counties_2020['total votes sen']
counties_2020['turnout pct'] = counties_2020['turnout pct'].apply(lambda x: np.nan if x > 100 else x)
counties_2020['total votes sen'] = counties_2020['total votes sen'].fillna(0)
#Calculates difference in margin
counties_2020['margin diff'] = counties_2020['margin pres'] - counties_2020['margin sen']
counties_2020['candidate'] = counties_2020['margin diff'].apply(lambda x: 'Trump' if x > 0 else 'Biden' if x <= 0 else 'N/A')
counties_2020['absolute diff'] = counties_2020['margin diff'].apply(abs)
counties_2020['absolute diff'] = counties_2020['absolute diff'].fillna(0)
counties_2020['absolute margin sen'] = counties_2020['absolute margin sen'].fillna(0)
counties_2020['margin direction sen'] = counties_2020['margin direction sen'].fillna('N/A')
counties_2020 = county_geog.merge(counties_2020, on='FIPS')
#Gets state level presidential and senate data
states_2020 = pd.pivot_table(states_2020, index=['state'], columns='race', values=values, aggfunc='first')
states_2020 = states_2020.reset_index()
states_2020.columns = states_2020.columns.droplevel(1) + ' ' + states_2020.columns.droplevel(0).map(str)
states_2020.columns = states_2020.columns.map(lambda x: x.strip())
#Calculates difference in turnout
states_2020['turnout diff'] = states_2020['total votes pres'] - states_2020['total votes sen']
states_2020['turnout pct'] = 100*states_2020['turnout diff']/states_2020['total votes sen']
states_2020['turnout pct'] = states_2020['turnout pct'].apply(lambda x: np.nan if x > 100 else x)
states_2020['total votes sen'] = states_2020['total votes sen'].fillna(0)
#Calculates difference in margin
states_2020['margin diff'] = states_2020['margin pres'] - states_2020['margin sen']
states_2020['candidate'] = states_2020['margin diff'].apply(lambda x: 'Trump' if x > 0 else 'Biden' if x <= 0 else 'N/A')
states_2020['absolute diff'] = states_2020['margin diff'].apply(abs)
states_2020['absolute diff'] = states_2020['absolute diff'].fillna(0)
states_2020['absolute margin sen'] = states_2020['absolute margin sen'].fillna(0)
states_2020['margin direction sen'] = states_2020['margin direction sen'].fillna('N/A')
states_2020 = state_geog.merge(states_2020, on='state')
To start, let's compare the turnout for president and senate in the states that had both elections in 2020.
hover = HoverTool(tooltips=[('', ' Turnout: @turnout million')])
national_2020['election'] = national_2020['race'].apply(lambda x: 'President' if x=='pres' else 'Senate')
national_2020.hvplot.bar(x='election', y='turnout', title= 'Turnout in States with both Presidential and Senate Elections',
xlabel='Election', ylabel='Turnout (millions)', tools=[hover], hover_cols=['turnout'])
There were only about 1 million more votes cast for president overall in the states that had both elections. We can also compare how each party did in these elections combined.
hover = HoverTool(tooltips=[('', 'Election Margin: +@absolute_margin% @margin_direction')])
hover_cols = ['absolute margin', 'margin direction']
national_2020.hvplot.bar(x='election', y='margin', xlabel='Election', ylabel='Margin (%)',
title= 'Combined Margin in States with both Presidential and Senate Elections',
tools=[hover], hover_cols=hover_cols, c='red')
Even though Biden won the overall popular vote convincingly, Trump won the popular vote in the states that had a senate election in 2020. The republican senate candidates in those states won by a much larger margin. Now let's look at Georgia specifically.
hover = HoverTool(tooltips=[('', '@county, @state'),
('', 'President Margin: +@absolute_margin_pres% @margin_direction_pres'),
('', 'Senate Margin: +@absolute_margin_sen% @margin_direction_sen'),
('', 'Difference: +@absolute_diff% @candidate')])
hover_cols=['state', 'county', 'absolute margin pres', 'margin direction pres', 'absolute margin sen', 'margin direction sen',
'absolute diff', 'candidate']
georgia_2020 = counties_2020.loc[counties_2020['state'] == 'Georgia']
georgia_2020.hvplot(c='margin diff', geo=True, title='Georgia Presidential result relative to Senate',
tools=[hover], hover_cols=hover_cols, cmap='bwr', clim=(-10, 10))
Biden did slightly better than the democratic senate candidate in Georgia, Jon Ossoff, in most counties. To see whether this is unusual we can compare this to other states.
hover = HoverTool(tooltips=[('', '@state'),
('', 'President Margin: +@absolute_margin_pres% @margin_direction_pres'),
('', 'Senate Margin: +@absolute_margin_sen% @margin_direction_sen'),
('', 'Difference: +@absolute_diff% @candidate')])
hover_cols=['state', 'absolute margin pres', 'margin direction pres', 'absolute margin sen', 'margin direction sen',
'absolute diff', 'candidate']
states_2020.hvplot(c='margin diff', geo=True, title='Presidential result relative to Senate by State',
tools=[hover], hover_cols=hover_cols, cmap='bwr', clim=(-25, 25))
hover = HoverTool(tooltips=[('', '@county, @state'),
('', 'President Margin: +@absolute_margin_pres% @margin_direction_pres'),
('', 'Senate Margin: +@absolute_margin_sen% @margin_direction_sen'),
('', 'Difference: +@absolute_diff% @candidate')])
hover_cols=['state', 'county', 'absolute margin pres', 'margin direction pres', 'absolute margin sen', 'margin direction sen',
'absolute diff', 'candidate']
counties_2020.hvplot(c='margin diff', geo=True, title='Presidential result relative to Senate by County',
tools=[hover], hover_cols=hover_cols, cmap='bwr', clim=(-25, 25)) * borders
In about half of the states with a senate election Biden did better than the democratic senate candidate. In the other half, Trump did better than the republican senate candidate. Georgia had a relatively small difference. The two states that stand out are Nebraska and Maine.
Nebraska and Maine are the only two states that award electoral college votes based on how each congressional district votes in addition to the whole state's vote. As we can see in the county level map, Nebraskans around Omaha voted for Biden despite also electing republican Ben Sasse. These voters gave Biden one of Nebraska's electoral votes even though Trump won the whole state.
Maine's tilt towards Biden doesn't seem to change much by county. This is probably because moderate voters around the state supported famously moderate republican Susan Collins while also voting for Biden.
Southern and Eastern Arkansas show another interesting trend. The Arkansas senate race had no democratic candidate, only a republican and a libertarian. Biden convincingly overperformed the libertarian in the rural, majority black counties near the Mississippi Delta. The democratic senate candidates in nearby Mississippi and Alabama ran ahead of Biden.
Now we will look at how turnout in senate and presidential elections compared.
georgia_2020 = counties_2020.loc[counties_2020['state'] == 'Georgia']
georgia_2020.hvplot(c='turnout pct', geo=True, title='Georgia Presidential turnout relative to Senate',
tools=[hover], hover_cols=hover_cols, cmap='greens', clim=(0, 10))
Throughout the entire state of Georgia, the presidential race saw slightly higher turnout than the senate race. We can compare this to the rest of the country.
hover = HoverTool(tooltips=[('', '@state'),
('', 'President Turnout: @total_votes_pres'),
('', 'Senate Turnout: @total_votes_sen'),
('', 'Difference: +@turnout_pct% President')])
hover_cols=['state', 'total votes pres', 'total votes sen', 'turnout pct']
states_2020.hvplot(c='turnout pct', geo=True, title='Presidential turnout relative to Senate by State',
tools=[hover], hover_cols=hover_cols, cmap='greens', clim=(0, 10))
hover = HoverTool(tooltips=[('', '@county, @state'),
('', 'President Turnout: @total_votes_pres'),
('', 'Senate Turnout: @total_votes_sen'),
('', 'Difference: +@turnout_pct% President')])
hover_cols=['state', 'county', 'total votes pres', 'total votes sen', 'turnout pct']
counties_2020.hvplot(c='turnout pct', geo=True, title='Presidential turnout relative to Senate by County',
tools=[hover], hover_cols=hover_cols, cmap='greens', clim=(0, 10)) * borders
Georgia is entirely unremarkable here, so let's return to the states we were looking at earlier.
Nebraska saw much higher presidential turnout than senate turnout, concentrated around Omaha, where Biden outperformed the senate democrat. This suggests that there really were a lot of Biden only ballots in Omaha.[3]
Despite a big difference in results between Biden and democratic senate candidate Sara Gideon, the turnouts between Maine's senate and presidential races were very close across the whole state. This suggests that there were a lot of Biden-Collins split ticket votes dispersed across the whole state.
In the same Arkansas counties where Biden outran the libertarian senate candidate, there were a lot more presidential voters than senate voters. This once again suggests Biden only voters.
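The reasoning applied to Nebraska, Maine, and Arkansas combines the two measures: a large gap in margin with a large gap in turnout points toward president-only ballots, while a large gap in margin with nearly equal turnout points toward split tickets. A rough sketch of that rule of thumb follows. The function name, the thresholds, and the example inputs are all hypothetical and chosen only for illustration, and the labels describe what the numbers are consistent with, not what actually happened on individual ballots.

```python
def likely_explanation(margin_diff, turnout_pct, margin_threshold=5.0, turnout_threshold=5.0):
    """Rough reading of the two measures for one county or state.

    margin_diff: presidential margin minus senate margin, in percentage points
    turnout_pct: how much higher presidential turnout was than senate turnout, in percent
    Both thresholds are arbitrary illustrative cutoffs, not values from the analysis.
    """
    if abs(margin_diff) < margin_threshold:
        return 'similar results in both races'
    if turnout_pct >= turnout_threshold:
        return 'consistent with president-only ballots'
    return 'consistent with split-ticket voting'

# Hypothetical inputs, not real county values
print(likely_explanation(margin_diff=-12.0, turnout_pct=9.0))
print(likely_explanation(margin_diff=-12.0, turnout_pct=0.5))
print(likely_explanation(margin_diff=-1.0, turnout_pct=1.0))
```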
Cortes claims that mail-in ballots were not properly vetted. He specifically mentions Pennsylvania as a state he thinks did not properly vet its mail-in ballots. His evidence for this is a mail-in ballot rejection rate of 0.03%, which is unusually low. Cortes does not cite a source for this number, and I was unable to find any source to confirm this. In fact, complete data on the number of mail-in ballots rejected in Pennsylvania is not yet available. Ballotpedia lists mail-in rejection rate by state, but does not yet have a number listed for Pennsylvania, or a majority of states. The states that have data listed do not seem unusual compared to past years. A USA Today story states:
Philadelphia's final mail ballot rejection numbers are still being sorted out, along with other counties and states: It could be spring before the full number of rejected ballots is known.
So where could Cortes have gotten this number? I suspect he got it from a justthenews.com article first published on November 6, 2020. This article provided a 0.03% rejection rate, and even suggested that the number raised "potential questions." Cortes published his article 3 days later. The justthenews.com article was later updated with the following correction:
Correction: An earlier version of this story incorrectly reported a premature, overall mail-in ballot rejection rate in Pennsylvania on the basis of a partial, early count of rejected absentee ballots current as of Nov. 5. A final mail ballot rejection rate "is typically not available until some weeks after the election, once all ballots have been canvassed by counties," a spokeswoman for the Pennsylvania Department of State has clarified. In this cycle, she added, "The canvassing of mail ballots in some counties continued through the week of November 9."
Cortes repeated a false rejection rate as evidence of election irregularities. He has not corrected his article.
I have analyzed election data to test the four claims of election irregularities that Steve Cortes makes in his article. None of these claims holds up under closer inspection. Much of his analysis is based on flawed calculations, or complete misinformation. He cherry-picks states to make his points while ignoring nationwide trends. He cites numbers that sound improbable but are statistically meaningless, such as the fact that Biden doubled Obama's margin in a Pennsylvania county that Obama won by a small margin 8 years earlier.
The areas that Cortes called unusual were not unusual, but there were some places that actually were unusual. For example, rural, majority-Hispanic Starr County, Texas saw an increase in turnout of 49% from 2016. The county swung towards Trump by 68% relative to 2012. Trump surpassed Republican senator John Cornyn's margin by 10%, on 18% higher turnout in the presidential election. This suggests the presence of Trump only voters. Every single one of these numbers stands out. These are some of the most extreme swings of any county in the US. Is this evidence of fraud favoring Trump in Starr County?
No. This was simply a county where Trump did extraordinarily well compared to past elections. In a country with over 3,000 counties, some counties are going to be extraordinary. There will also always be trends between elections. In 2020 suburbs shifted blue and rural areas shifted red. If votes didn't change between elections there would be no need to have them every 4 years. Cortes takes a different view. He treats every unexpected outcome as inconceivable. He portrays broad trends as inexplicable anomalies.
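The point about 3,000 counties can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each county's result is an independent draw from a normal distribution. Real county results are correlated and not exactly normal, so this is a simplification rather than a model of the election. Under that assumption it computes the chance that at least one county lands more than a given number of standard deviations from the average.

```python
import math

def prob_at_least_one_extreme(n_counties, z):
    """Chance that at least one of n independent standard-normal draws has |value| > z."""
    p_single = math.erfc(z / math.sqrt(2))  # two-sided tail probability for a single draw
    return 1 - (1 - p_single) ** n_counties

p_three_sigma = prob_at_least_one_extreme(3000, 3.0)
p_four_sigma = prob_at_least_one_extreme(3000, 4.0)
print(round(p_three_sigma, 4), round(p_four_sigma, 4))
```

Under these assumptions, a "three sigma" county somewhere in the country is close to certain, and even a "four sigma" county is far from rare. An outlier on its own is therefore weak evidence of anything.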
All of this serves his article's central thesis: that Biden's victory was statistically improbable. This conclusion is truly absurd. Biden got more votes than Trump in a high turnout election where he overperformed Obama in suburbs and underperformed Obama in cities. He outran some democratic senate candidates, and underran others. There were fewer senate votes than presidential votes. None of this is irregular. None of this is improbable. There is simply no statistical case against Biden's win.
[0]There is a very real and important field that uses statistics to look for election fraud. Cortes does not use these statistical methods, or any other statistical methods. He simply uses individual examples out of context. There are also studies that examine other claims of election irregularities in the 2020 election.
[1]The vast majority of counties in the US are just counties, but not all of them. All the counties in Louisiana are called Parishes. Washington DC is a federal district. Then there are some cities that are their own jurisdictions, and not in any county. Baltimore, MD; Carson City, NV; and St. Louis, MO are all independent cities. The remaining 38 independent cities are in Virginia. In Virginia every town of at least 10,000 becomes an independent city (in theory). Some of these cities share a name with a county in Virginia. Baltimore and St. Louis also share names with a county in Maryland and Missouri respectively. This all makes it hard to avoid false matches (Baltimore City vs Baltimore County), so I decided to classify every jurisdiction in my database as a county, city, parish or district. This process was further complicated by three edge cases, all in Virginia. Charles City County is a county in Virginia. It is not an independent city despite its name. This is also true of James City County, Virginia. Bedford City, Virginia lost its independent status in 2013, and was absorbed into Bedford County. This happened right in the middle of the time period I am analyzing.
[2]It can be very apparent when states do change their rules. For example, Georgia implemented automatic voter registration between 2016 and 2020 and saw turnout increase to record levels.
[3] In this context "Biden only" means a lot of voters probably voted for Biden and not the senate. We don't know whether these voters voted in other races down ballot. It is also possible there were a lot of Trump voters who didn't vote for senate, and a lot of Biden voters who also voted for Senator Sasse. I suspect it was mostly "Biden only" voters, but there is no way to know for certain.