Bartik, Shift Share, and Formula Instruments

Joel Ferguson

Basics

\(Z_{i,t} = \sum\limits_{k\in\mathcal{K}}w^k_{i}g^k_{t}\)

\(w^k_{i}\): Shares, e.g. share of employment in industry \(k\)
\(g^k_{t}\): Shifts, e.g. growth rate of industry \(k\)
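
As a concrete illustration of the formula, here is a minimal sketch with made-up numbers: two regions and three industries. The names `emp_base`, `growth`, `w_toy`, and `Z_toy`, and all values, are assumptions for illustration only.

```r
# Hypothetical baseline employment: 2 regions (rows) x 3 industries (columns)
emp_base = matrix(c(50, 30, 20,
                    10, 10, 80),
                  nrow=2, ncol=3, byrow=TRUE)

# Shares: each region's employment share in industry k (rows sum to 1)
w_toy = emp_base/rowSums(emp_base)

# Hypothetical industry growth rates (shifts), Kx1
growth = matrix(c(0.10, -0.05, 0.02), nrow=3, ncol=1)

# Shift share: share-weighted sum of shifts for each region
Z_toy = w_toy%*%growth
print(Z_toy)
```

Region 1 has shares (0.5, 0.3, 0.2), so its instrument is 0.5(0.10) + 0.3(-0.05) + 0.2(0.02) = 0.039; region 2, concentrated in the third industry, gets 0.021.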

Three Paths to Identification

Shares are Exogenous: Goldsmith-Pinkham, Sorkin, and Swift (2020) AER
Shifts are Exogenous: Borusyak, Hull, and Jaravel (2022) REStud
Special Case of Non-Random Exposure to Exogenous Shocks: Borusyak and Hull (forthcoming) Ecma

The three approaches differ in which variation drives causal identification and in how statistical inference is done

Shares are Exogenous

SSIV is numerically equivalent to GMM with shares as instruments

Weighting matrix: Outer product of the shifts

In this case, identifying assumption is that shares are uncorrelated with unobserved determinants of outcome, conditional on controls

Much more palatable if the outcome is in changes and shares are held constant at baseline values

Shares are Exogenous: Math

\(Y_{i,t} = \alpha + \beta D_{i,t} + X_{i,t} \Gamma+ \varepsilon_{i,t}\)

Assumption: \(\mathbb{E}[\varepsilon_{i,t}w^k_{i}|X_{i,t}]=0\)

Instrument: \(Z_{i,t} = \sum\limits_{k\in\mathcal{K}}w^k_{i}g^k_{t}\)

Equivalent to GMM with \(W\) as instruments and \(G_tG_t'\) as weighting matrix

\(\frac{D^{\perp'}W(GG')W'Y^{\perp}}{D^{\perp'}W(GG')W'D^{\perp}}\rightarrow\beta\)
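
The equivalence with SSIV can be sketched in one line. Here \(W\) is the \(N\times K\) matrix of shares, \(G\) the \(K\times 1\) vector of shifts (so \(Z=WG\)), and \(D^{\perp}\), \(Y^{\perp}\) the treatment and outcome residualized on the controls. The sketch assumes the scalar \(D^{\perp'}WG\) is nonzero, so that it cancels:

\(\frac{(D^{\perp'}WG)(G'W'Y^{\perp})}{(D^{\perp'}WG)(G'W'D^{\perp})} = \frac{G'W'Y^{\perp}}{G'W'D^{\perp}} = \frac{Z'Y^{\perp}}{Z'D^{\perp}}\)

The last expression is the just-identified IV estimator with \(Z\) as the instrument.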

Shares are Exogenous: Code

set.seed(1115)
N = 3000 # Number of units
K = 10 # Number of industries

w = matrix(runif(N*K),
           nrow=N,
           ncol=K) # Shares are an NxK matrix
# Normalize shares to sum to 1
w = w/as.vector((w%*%matrix(1,nrow=K,ncol=1))) 

g = matrix(runif(K,-1,1),
           nrow=K,
           ncol=1) # Shifts are Kx1 vector

Z = w%*%g # Shift share

X = runif(N,-1,1) # Control
e = rnorm(N,0,0.1)  # Errors 
D = as.vector(Z) + e # Treatment correlated with error

Y = D + X + e # True effect of treatment is 1

# Put stuff in matrices
Y = matrix(Y,nrow=N,ncol=1)

rhs = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  matrix(D,nrow=N,ncol=1)  # Treatment
)

rhs_short = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1) # Control
)

# Annihilator matrix
annh = diag(N) - 
  rhs_short%*%solve(t(rhs_short)%*%rhs_short)%*%t(rhs_short)


inst = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  Z  # instrument
)

shift_share_iv_coef = solve(t(inst)%*%rhs) %*% (t(inst)%*%Y)
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))

[1] "Shift Share IV Coef: 1.00437876156168"

Shares are Exogenous: Code

Y_fwl = annh%*%Y
D_fwl = annh%*%D

gmm_coef = (t(D_fwl)%*%w%*%(g%*%t(g))%*%t(w)%*%Y_fwl) /
  (t(D_fwl)%*%w%*%(g%*%t(g))%*%t(w)%*%D_fwl)
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))

[1] "Shift Share IV Coef: 1.00437876156168"

print(paste("GMM Coef:", gmm_coef))

[1] "GMM Coef: 1.00437876156166"

Shares are Exogenous: Takeaways

If shares are exogenous, SSIV is numerically equivalent to GMM with shares as instruments and shifts as weights
Can be more palatable if outcome is specified in changes and shares are set at some baseline level
Generally hard to think of a design that generates this assumption though

Shifts are Exogenous

There is a numerical equivalence between SSIV and a shift-level IV where the shift is the instrument

Outcome: share-weighted outcome
Treatment: share-weighted treatment

Here, identification comes from shifts being uncorrelated with unobserved determinants, rather than shares

Potentially lends itself more easily to design-based inference

Shifts are Exogenous: Math

Standard IV orthogonality is equivalent to shock-level exogeneity

\(\mathbb{E}[\sum\limits_{k\in\mathcal{K}} w^kg_t^k\bar{\varepsilon}_{k,t}]=0\)

Where \(\bar{\varepsilon}_{k,t}=\frac{\sum\limits_{i}w_{i}^k\varepsilon_{i,t}}{\sum\limits_{i}w_i^k}\) and \(w^k=\frac{1}{N}\sum\limits_{i}w_i^k\) is the average share of industry \(k\)

And SSIV estimator is equivalent to a share-weighted IV estimator using shocks \(g_t^k\) as instruments

\(\bar{Y}^{\perp}_{k,t}=\beta \bar{D}^{\perp}_{k,t}+\bar{\varepsilon}^{\perp}_{k,t}\)
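
The equivalence follows from swapping the order of summation. A short sketch for the numerator, using only the definition of \(Z_{i,t}\):

\(\sum\limits_{i} Z_{i,t}Y^{\perp}_{i,t} = \sum\limits_{i}\sum\limits_{k\in\mathcal{K}} w_i^k g_t^k Y^{\perp}_{i,t} = \sum\limits_{k\in\mathcal{K}} g_t^k \left(\sum\limits_{i} w_i^k Y^{\perp}_{i,t}\right)\)

The same step applies to the denominator with \(D^{\perp}_{i,t}\) in place of \(Y^{\perp}_{i,t}\), so the unit-level IV ratio equals a shock-level IV ratio in which the inner sums are the share-weighted outcome and treatment.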

Shifts are Exogenous: Code

Y_bar = t(w)%*%Y_fwl # Weight FWL'd outcomes by shares
# Y_bar is KxN%*%Nx1 = Kx1
D_bar = t(w)%*%D_fwl # Same for treatment

shock_iv_coef = solve(t(g)%*%D_bar)%*%(t(g)%*%Y_bar) # IV
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))

[1] "Shift Share IV Coef: 1.00437876156168"

print(paste("Shock-level IV Coef:", shock_iv_coef))

[1] "Shock-level IV Coef: 1.00437876156166"

Shifts are Exogenous: Takeaways

SSIV is equivalent to shock-level IV with shock as instrument
Shocks are the level of treatment assignment, so good inference requires many shocks
Easier to think of how a design assigns shocks than shares
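
One way to gauge whether there are "many shocks" is the inverse Herfindahl index of the average shares, a diagnostic used in the shift-share literature. The sketch below is illustrative only: the simulated shares and the names `w_eff`, `s`, and `eff_shocks` are assumptions, not part of the slides' simulation.

```r
set.seed(3)
N_eff = 1000 # Number of units
K_eff = 10   # Number of shocks

# Simulated shares, normalized to sum to 1 within each unit
w_eff = matrix(runif(N_eff*K_eff), nrow=N_eff, ncol=K_eff)
w_eff = w_eff/rowSums(w_eff)

s = colMeans(w_eff)     # Average exposure share of each shock
eff_shocks = 1/sum(s^2) # Inverse Herfindahl index of average shares
print(eff_shocks)
```

The value lies between 1 (all exposure on one shock) and the number of shocks (exposure spread evenly), so a small value signals that a few shocks dominate the instrument.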

SSIV as a Special Case of Non-Random Exposure

\(Z_{i,t} = \sum\limits_{k\in\mathcal{K}}w^k_{i}g^k_{t}\) is a special case of \(Z_{i,t}=h(g_t;w_i)\)

When shares are correlated with unobserved determinants of the outcome and shifts aren’t all mean zero, SSIV can be biased

Solution: Re-center instrument around expectation \(Z_{i,t}-\mathbb{E}[Z_{i,t}]\)

SSIV as a Special Case of Non-Random Exposure: Math

\(\frac{1}{NT}\sum\limits_{i,t} \mathbb{E}[Z_{i,t}\varepsilon_{i,t}]=\frac{1}{NT}\sum\limits_{i,t} \mathbb{E}[\mathbb{E}[h(g_t;w_i)|w_i]\varepsilon_{i,t}]\)

Not necessarily zero unless observation-level expected shocks are uncorrelated with unobservables

Re-centering mechanically removes correlation
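
A sketch of why re-centering works, under the assumption (not stated explicitly above) that shocks are independent of the errors conditional on shares. Writing \(\mu_{i,t}=\mathbb{E}[h(g_t;w_i)|w_i]\) for the expected instrument:

\(\mathbb{E}[(Z_{i,t}-\mu_{i,t})\varepsilon_{i,t}] = \mathbb{E}[\mathbb{E}[Z_{i,t}-\mu_{i,t}|w_i,\varepsilon_{i,t}]\varepsilon_{i,t}] = \mathbb{E}[(\mathbb{E}[h(g_t;w_i)|w_i]-\mu_{i,t})\varepsilon_{i,t}] = 0\)

The first equality is the law of iterated expectations; the second uses the conditional independence of shocks and errors.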

SSIV as a Special Case of Non-Random Exposure: Code

g_means = runif(K,0,5)

g = matrix(sapply(g_means,function(x) rnorm(1,x,1)),
              nrow=K,
              ncol=1) # Shifts are Kx1 vector

Z = w%*%g # Shift share
mu = w%*%g_means
Z_rc = Z-mu

# Errors correlated with expected instrument
e1 = sapply(as.vector(mu),
            function(x) rnorm(1,x,0.3))  
e2 = sapply(e1,
            function(x) rnorm(1,x,0.1))

D = as.vector(Z) + e1 # Treatment correlated with error

Y = D + X + e2 # True effect of treatment is 1

# Put stuff in matrices
Y = matrix(Y,nrow=N,ncol=1)

rhs = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  matrix(D,nrow=N,ncol=1)  # Treatment
)

rhs_short = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1) # Control
)

# Annihilator matrix
annh = diag(N) - 
  rhs_short%*%solve(t(rhs_short)%*%rhs_short)%*%t(rhs_short)


inst = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  Z  # instrument
)

shift_share_iv_coef = solve(t(inst)%*%rhs) %*% (t(inst)%*%Y)
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))

[1] "Shift Share IV Coef: 1.38369346953507"

SSIV as a Special Case of Non-Random Exposure: Code

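# Residualize the new Y and D (Y_fwl and D_fwl from the earlier slides
# were built from the previous simulation)
Y_fwl = annh%*%Y
D_fwl = annh%*%D

recentered_iv_coef = solve(t(Z_rc)%*%D_fwl) %*% (t(Z_rc)%*%Y_fwl)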
print(paste("Re-centered IV Coef:", recentered_iv_coef))

[1] "Re-centered IV Coef: 0.957521500615535"

SSIV as a Special Case of Non-Random Exposure: Takeaways

Problems can arise in SSIV when expected exposure to shifts is correlated with unobservables
Possible to do proper design-based inference even in these tricky cases
Requires knowledge of shock distribution to perform re-centering and do finite-sample inference
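
When the exposure mapping \(h\) is not linear, the expected instrument generally has no closed form, but it can be approximated by averaging over draws from the shock distribution. The following is a minimal sketch, assuming normal shocks with known means; the names `h`, `draw_shocks`, `mu_hat`, and the `_sim` variables are hypothetical. It uses the linear shift-share mapping so that the simulated answer can be checked against the exact one.

```r
set.seed(1)
N_sim = 200 # Number of units
K_sim = 5   # Number of shocks

w_sim = matrix(runif(N_sim*K_sim), nrow=N_sim, ncol=K_sim)
w_sim = w_sim/rowSums(w_sim)     # Shares sum to 1 within each unit
g_means_sim = c(0, 1, 2, 3, 4)   # Assumed known shock means

# Exposure mapping; here the linear shift share, but any function of (g, w) works
h = function(g, w) as.vector(w%*%g)

# One draw from the assumed shock distribution
draw_shocks = function() rnorm(K_sim, g_means_sim, 1)

S = 2000
Z_draws = replicate(S, h(draw_shocks(), w_sim)) # N_sim x S matrix of counterfactual instruments
mu_hat = rowMeans(Z_draws)                      # Simulated expected instrument

mu_exact = as.vector(w_sim%*%g_means_sim)       # Exact answer in the linear case
print(max(abs(mu_hat - mu_exact)))
```

The re-centered instrument is then the realized instrument minus `mu_hat`.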

A Word on Inference

With exogenous shares it isn’t immediately obvious how to do proper statistical inference
With exogenous shifts, heteroskedasticity-robust SEs from the shock-level regression are valid under the assumption of a large number of uncorrelated shocks (similar conclusions from Adão et al. 2019)
With non-random exposure, finite-sample randomization inference is possible using the correlation between the (simulated) re-centered instrument and the (FWL’d) outcome
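
The randomization inference idea can be sketched as follows. This is an illustrative sketch rather than a definitive implementation: it assumes the shock distribution is known, uses only a constant as a control, and all names (`ri_pvalue`, `draw_g`, the `_ri` variables) are hypothetical. Under the null \(\beta=b\), the residual \(Y-bD\) does not depend on the shocks, so re-drawing shocks traces out the null distribution of the test statistic.

```r
set.seed(2)
N_ri = 500 # Number of units
K_ri = 8   # Number of shocks

w_ri = matrix(runif(N_ri*K_ri), nrow=N_ri, ncol=K_ri)
w_ri = w_ri/rowSums(w_ri)

g_means_ri = runif(K_ri, 0, 5)                 # Assumed known shock means
draw_g = function() rnorm(K_ri, g_means_ri, 1) # Assumed known shock distribution
mu_ri = as.vector(w_ri%*%g_means_ri)           # Expected instrument

g_obs = draw_g()                               # Realized shocks
Z_obs = as.vector(w_ri%*%g_obs)
e_ri = mu_ri + rnorm(N_ri, 0, 0.3)             # Error correlated with expected instrument
D_ri = Z_obs + rnorm(N_ri, 0, 0.3)
Y_ri = D_ri + e_ri                             # True effect of treatment is 1

demean = function(v) v - mean(v)               # FWL with only a constant

# p-value for H0: beta = b, comparing the realized statistic to re-drawn shocks
ri_pvalue = function(b, S=999) {
  resid_b = demean(Y_ri - b*D_ri)              # Fixed under the null
  stat = function(g) sum((as.vector(w_ri%*%g) - mu_ri)*resid_b)
  T_obs = stat(g_obs)
  T_sim = replicate(S, stat(draw_g()))
  (1 + sum(abs(T_sim) >= abs(T_obs)))/(S + 1)
}

p_true = ri_pvalue(1) # Null equal to the true effect
p_zero = ri_pvalue(0) # Null of no effect
print(c(p_true, p_zero))
```

A confidence set can be formed by collecting the values of `b` that are not rejected. With only a handful of shocks, the test may have little power.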