Bartik, Shift Share, and Formula Instruments

Joel Ferguson

Basics

\(Z_{i,t} = \sum\limits_{k\in\mathcal{K}}w^k_{i}g^k_{t}\)

  • \(w^k_{i}\): Shares, e.g. share of employment in industry \(k\)

  • \(g^k_{t}\): Shifts, e.g. growth rate of industry \(k\)
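As a small worked example of the formula above, the sketch below builds the instrument for three regions and two industries. All numbers and the `_toy` names are made up for illustration.

```r
# Toy shift-share instrument: 3 regions (rows) and 2 industries (columns).
# All values are hypothetical.
w_toy = matrix(c(0.7, 0.3,
                 0.5, 0.5,
                 0.2, 0.8),
               nrow=3, ncol=2, byrow=TRUE) # Shares, each row sums to 1
g_toy = matrix(c(0.10, -0.05),
               nrow=2, ncol=1)             # Shifts, one per industry

# Share-weighted sum of shifts, one value per region:
# Region 1: 0.7*0.10 + 0.3*(-0.05) =  0.055
# Region 2: 0.5*0.10 + 0.5*(-0.05) =  0.025
# Region 3: 0.2*0.10 + 0.8*(-0.05) = -0.020
Z_toy = w_toy%*%g_toy
print(as.vector(Z_toy))
```

Regions more exposed to the growing industry get a larger value of the instrument, even though every region faces the same two shifts.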

Three Paths to Identification

  • Shares are Exogenous: Goldsmith-Pinkham, Sorkin, and Swift (2020) AER

  • Shifts are Exogenous: Borusyak, Hull, and Jaravel (2022) REStud

  • Special Case of Non-Random Exposure to Exogenous Shocks: Borusyak and Hull (forthcoming) Ecma

The three approaches differ in which variation drives causal identification and in how to do statistical inference

Shares are Exogenous

SSIV is numerically equivalent to GMM with shares as instruments

  • Weighting matrix: Outer product of the shifts

In this case, the identifying assumption is that shares are uncorrelated with unobserved determinants of the outcome, conditional on controls

  • Much more palatable if the outcome is in changes and shares are held constant at baseline values

Shares are Exogenous: Math

\(Y_{i,t} = \alpha + \beta D_{i,t} + X_{i,t} \Gamma+ \varepsilon_{i,t}\)

Assumption: \(\mathbb{E}[\varepsilon_{i,t}w^k_{i}|X_{i,t}]=0\)

Instrument: \(Z_{i,t} = \sum\limits_{k\in\mathcal{K}}w^k_{i}g^k_{t}\)

Equivalent to GMM with \(W\) as instruments and \(GG'\) as weighting matrix

\(\frac{D^{\perp'}W(GG')W'Y^{\perp}}{D^{\perp'}W(GG')W'D^{\perp}}\rightarrow\beta\)
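The step from this GMM expression to the usual SSIV estimator can be spelled out. The sketch below writes \(D^{\perp}\) and \(Y^{\perp}\) for the treatment and outcome residualized on the controls, stacks the instrument as \(Z=WG\), and assumes the first stage is non-zero (\(Z'D^{\perp}\neq 0\)). The labels \(\hat{\beta}_{GMM}\) and \(\hat{\beta}_{SSIV}\) are introduced here for illustration.

```latex
\begin{aligned}
\hat{\beta}_{GMM}
&= \frac{D^{\perp'}W(GG')W'Y^{\perp}}{D^{\perp'}W(GG')W'D^{\perp}}
 = \frac{(D^{\perp'}WG)(G'W'Y^{\perp})}{(D^{\perp'}WG)(G'W'D^{\perp})} \\
&= \frac{(D^{\perp'}Z)(Z'Y^{\perp})}{(D^{\perp'}Z)(Z'D^{\perp})}
 = \frac{Z'Y^{\perp}}{Z'D^{\perp}}
 = \hat{\beta}_{SSIV}
\end{aligned}
```

Because \(D^{\perp'}Z\) is a scalar, it cancels, so the weighting matrix \(GG'\) collapses the \(K\) share instruments into the single shift-share instrument.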

Shares are Exogenous: Code

set.seed(1115)
N = 3000 # Number of units
K = 10 # Number of industries

w = matrix(runif(N*K),
           nrow=N,
           ncol=K) # Shares are an NxK matrix
# Normalize shares to sum to 1
w = w/as.vector((w%*%matrix(1,nrow=K,ncol=1))) 

g = matrix(runif(K,-1,1),
           nrow=K,
           ncol=1) # Shifts are Kx1 vector

Z = w%*%g # Shift share

X = runif(N,-1,1) # Control
e = rnorm(N,0,0.1)  # Errors 
D = as.vector(Z) + e # Treatment correlated with error

Y = D + X + e # True effect of treatment is 1

# Put stuff in matrices
Y = matrix(Y,nrow=N,ncol=1)

rhs = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  matrix(D,nrow=N,ncol=1)  # Treatment
)

rhs_short = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1) # Control
)

# Annihilator matrix
annh = diag(N) - 
  rhs_short%*%solve(t(rhs_short)%*%rhs_short)%*%t(rhs_short)


inst = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  Z  # instrument
)

shift_share_iv_coef = solve(t(inst)%*%rhs) %*% (t(inst)%*%Y)
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))
[1] "Shift Share IV Coef: 1.00437876156168"

Shares are Exogenous: Code

Y_fwl = annh%*%Y
D_fwl = annh%*%D

gmm_coef = (t(D_fwl)%*%w%*%(g%*%t(g))%*%t(w)%*%Y_fwl) /
  (t(D_fwl)%*%w%*%(g%*%t(g))%*%t(w)%*%D_fwl)
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))
[1] "Shift Share IV Coef: 1.00437876156168"
print(paste("GMM Coef:", gmm_coef))
[1] "GMM Coef: 1.00437876156166"

Shares are Exogenous: Takeaways

  • If shares are exogenous, SSIV is numerically equivalent to GMM with shares as instruments and shifts as weights

  • Can be more palatable if outcome is specified in changes and shares are set at some baseline level

  • In practice, it is generally hard to think of a research design that delivers this assumption

Shifts are Exogenous

There is a numerical equivalence between SSIV and a shift-level IV where the shift is the instrument

  • Outcome: share-weighted outcome
  • Treatment: share-weighted treatment

Here, identification comes from shifts being uncorrelated with unobserved determinants, rather than shares

  • Potentially lends itself more easily to design-based inference

Shifts are Exogenous: Math

Standard IV orthogonality is equivalent to shock-level exogeneity

\(\mathbb{E}[\sum\limits_{k\in\mathcal{K}} w^kg_t^k\bar{\varepsilon}_{k,t}]=0\)

Where \(\bar{\varepsilon}_{k,t}=\frac{\sum\limits_{i}w_{i}^k\varepsilon_{i,t}}{\sum\limits_{i}w_i^k}\) and \(w^k=\frac{1}{N}\sum\limits_{i}w_i^k\) is the average share of shock \(k\)

The SSIV estimator is then equivalent to a share-weighted IV estimator using shocks \(g_t^k\) as instruments

\(\bar{Y}^{\perp}_{k,t}=\beta \bar{D}^{\perp}_{k,t}+\bar{\varepsilon}^{\perp}_{k,t}\)
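The equivalence follows from swapping the order of summation. The sketch below takes \(w^k\) to be the average share \(\frac{1}{N}\sum_i w_i^k\) (an assumption about the notation above) and defines \(\bar{Y}^{\perp}_{k,t}\) and \(\bar{D}^{\perp}_{k,t}\) as share-weighted averages, in the same way as \(\bar{\varepsilon}_{k,t}\).

```latex
\begin{aligned}
\frac{1}{N}\sum_{i} Z_{i,t}Y^{\perp}_{i,t}
&= \frac{1}{N}\sum_{i}\sum_{k\in\mathcal{K}} w_i^k g_t^k Y^{\perp}_{i,t}
 = \sum_{k\in\mathcal{K}} g_t^k \Big(\frac{1}{N}\sum_{i} w_i^k Y^{\perp}_{i,t}\Big)
 = \sum_{k\in\mathcal{K}} w^k g_t^k \bar{Y}^{\perp}_{k,t} \\
\hat{\beta}_{SSIV}
&= \frac{\sum_{i} Z_{i,t}Y^{\perp}_{i,t}}{\sum_{i} Z_{i,t}D^{\perp}_{i,t}}
 = \frac{\sum_{k\in\mathcal{K}} w^k g_t^k \bar{Y}^{\perp}_{k,t}}{\sum_{k\in\mathcal{K}} w^k g_t^k \bar{D}^{\perp}_{k,t}}
\end{aligned}
```

The last ratio is a shock-level IV regression of \(\bar{Y}^{\perp}_{k,t}\) on \(\bar{D}^{\perp}_{k,t}\), weighted by \(w^k\) and instrumented by \(g_t^k\).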

Shifts are Exogenous: Code

Y_bar = t(w)%*%Y_fwl # Weight FWL'd outcomes by shares
# Y_bar is KxN%*%Nx1 = Kx1
D_bar = t(w)%*%D_fwl # Same for treatment

shock_iv_coef = solve(t(g)%*%D_bar)%*%(t(g)%*%Y_bar) # IV
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))
[1] "Shift Share IV Coef: 1.00437876156168"
print(paste("Shock-level IV Coef:", shock_iv_coef))
[1] "Shock-level IV Coef: 1.00437876156166"

Shifts are Exogenous: Takeaways

  • SSIV is equivalent to shock-level IV with shock as instrument

  • Shocks are the level of treatment assignment, so good inference requires many shocks

  • Easier to think of how a design assigns shocks than shares

SSIV as a Special Case of Non-Random Exposure

\(Z_{i,t} = \sum\limits_{k\in\mathcal{K}}w^k_{i}g^k_{t}\) is a special case of \(Z_{i,t}=h(g_t;w_i)\)

When shares are correlated with unobserved determinants of the outcome and shifts aren’t all mean zero, SSIV can be biased

  • Solution: Re-center the instrument around its expectation over shock draws, \(Z_{i,t}-\mathbb{E}[Z_{i,t}|w_i]\)
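When the expected instrument has no closed form, one option is to approximate it by simulation: redraw the shocks many times, recompute the instrument each time, and average. The sketch below is a minimal illustration that assumes the shock distribution is known (independent normals with known means and unit variance). The `_sim` names, the helper `h`, and all sizes are hypothetical.

```r
set.seed(1)
N_sim = 200  # Number of units (illustrative)
K_sim = 10   # Number of industries (illustrative)
S_sim = 2000 # Number of counterfactual shock draws (illustrative)

# Shares: NxK matrix with rows normalized to sum to 1
w_sim = matrix(runif(N_sim*K_sim), nrow=N_sim, ncol=K_sim)
w_sim = w_sim/rowSums(w_sim)

# ASSUMPTION: shocks are independent normals with known means and sd 1
g_means_sim = runif(K_sim, 0, 5)
g_obs = matrix(rnorm(K_sim, g_means_sim, 1), nrow=K_sim, ncol=1)

# Instrument as a generic function of shocks and shares, h(g; w)
h = function(g, w) as.vector(w%*%g)

Z_obs = h(g_obs, w_sim) # Realized instrument

# Draw S counterfactual shock vectors (one per column of a KxS matrix)
g_draws = matrix(rnorm(K_sim*S_sim, rep(g_means_sim, S_sim), 1),
                 nrow=K_sim, ncol=S_sim)

# Recompute the instrument for every draw (NxS) and average across draws
mu_hat = rowMeans(apply(g_draws, 2, function(g) h(g, w_sim)))

Z_rc_sim = Z_obs - mu_hat # Re-centered instrument

# For a linear h, the simulated expectation should be close to w %*% g_means
print(max(abs(mu_hat - as.vector(w_sim%*%g_means_sim))))
```

For the linear shift-share case, the expectation is available directly as the shares times the shock means. The simulation approach only matters when \(h\) is nonlinear in the shocks.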

SSIV as a Special Case of Non-Random Exposure: Math

\(\frac{1}{NT}\sum\limits_{i,t} \mathbb{E}[Z_{i,t}\varepsilon_{i,t}]=\frac{1}{NT}\sum\limits_{i,t} \mathbb{E}[\mathbb{E}[h(g_t;w_i)|w_i]\varepsilon_{i,t}]\)

This is not necessarily zero: it vanishes only if the observation-level expected instrument \(\mathbb{E}[h(g_t;w_i)|w_i]\) is uncorrelated with the unobservables

  • Re-centering mechanically removes this correlation
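The reasoning can be written as a decomposition of the instrument into its re-centered part and its expectation. The sketch below introduces the shorthand \(\mu_{i,t}\) for the expected instrument and assumes the shocks are independent of \(\varepsilon_{i,t}\) conditional on the shares.

```latex
\begin{aligned}
\mathbb{E}[Z_{i,t}\varepsilon_{i,t}]
&= \mathbb{E}[(Z_{i,t}-\mu_{i,t})\varepsilon_{i,t}] + \mathbb{E}[\mu_{i,t}\varepsilon_{i,t}],
\qquad \mu_{i,t}=\mathbb{E}[h(g_t;w_i)|w_i] \\
\mathbb{E}[(Z_{i,t}-\mu_{i,t})\varepsilon_{i,t}]
&= \mathbb{E}\big[\mathbb{E}[h(g_t;w_i)-\mu_{i,t}\,|\,w_i,\varepsilon_{i,t}]\,\varepsilon_{i,t}\big] = 0
\end{aligned}
```

Only the \(\mathbb{E}[\mu_{i,t}\varepsilon_{i,t}]\) term can be non-zero, and the re-centered instrument \(Z_{i,t}-\mu_{i,t}\) drops it by construction. Note that this holds in expectation over shock draws, so a single realization with few shocks can still be far from the truth.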

SSIV as a Special Case of Non-Random Exposure: Code

g_means = runif(K,0,5)

g = matrix(sapply(g_means,function(x) rnorm(1,x,1)),
              nrow=K,
              ncol=1) # Shifts are Kx1 vector

Z = w%*%g # Shift share
mu = w%*%g_means
Z_rc = Z-mu

# Errors correlated with expected instrument
e1 = sapply(as.vector(mu),
            function(x) rnorm(1,x,0.3))  
e2 = sapply(e1,
            function(x) rnorm(1,x,0.1))

D = as.vector(Z) + e1 # Treatment correlated with error

Y = D + X + e2 # True effect of treatment is 1

# Put stuff in matrices
Y = matrix(Y,nrow=N,ncol=1)

rhs = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  matrix(D,nrow=N,ncol=1)  # Treatment
)

rhs_short = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1) # Control
)

# Annihilator matrix
annh = diag(N) - 
  rhs_short%*%solve(t(rhs_short)%*%rhs_short)%*%t(rhs_short)


inst = cbind(
  matrix(1,nrow=N,ncol=1), # Constant
  matrix(X,nrow=N,ncol=1), # Control
  Z  # instrument
)

shift_share_iv_coef = solve(t(inst)%*%rhs) %*% (t(inst)%*%Y)
print(paste("Shift Share IV Coef:", shift_share_iv_coef[3,1]))
[1] "Shift Share IV Coef: 1.38369346953507"

SSIV as a Special Case of Non-Random Exposure: Code

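# Re-residualize with the new D and Y; the earlier Y_fwl and D_fwl
# were built from the first simulation's data
Y_fwl = annh%*%Y
D_fwl = annh%*%D

recentered_iv_coef = solve(t(Z_rc)%*%D_fwl) %*% (t(Z_rc)%*%Y_fwl)
print(paste("Re-centered IV Coef:", recentered_iv_coef))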

SSIV as a Special Case of Non-Random Exposure: Takeaways

  • Problems can arise in SSIV when expected exposure to shifts is correlated with unobservables

  • Possible to do proper design-based inference even in these tricky cases

  • Requires knowledge of shock distribution to perform re-centering and do finite-sample inference

A Word on Inference

  • With exogenous shares, it isn’t immediately obvious how to do proper statistical inference

  • With exogenous shifts, heteroskedasticity-robust SEs from the shock-level regression are valid under the assumption of a large number of uncorrelated shocks (similar conclusions from Adão et al. 2019)

  • With non-random exposure, finite-sample randomization inference is possible using the correlation between the (simulated) re-centered instrument and the (FWL’d) outcome
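The last bullet can be sketched as follows: under a hypothesized effect `b0`, the outcome net of treatment is held fixed, the shocks are redrawn from their assumed distribution, and the observed statistic is compared with the simulated ones. This is a minimal illustration, not a definitive implementation. The `_ri` names, the helper functions, the sizes, and the known normal shock distribution are all assumptions made for the example.

```r
set.seed(2)
N_ri = 300; K_ri = 10; S_ri = 500 # Illustrative sizes

w_ri = matrix(runif(N_ri*K_ri), nrow=N_ri, ncol=K_ri)
w_ri = w_ri/rowSums(w_ri)            # Shares, rows sum to 1

g_means_ri = runif(K_ri, 0, 5)       # ASSUMPTION: known shock means
mu_ri = as.vector(w_ri%*%g_means_ri) # Expected instrument

g_ri = rnorm(K_ri, g_means_ri, 1)    # Realized shocks
Z_ri = as.vector(w_ri%*%g_ri)        # Realized instrument

X_ri = runif(N_ri, -1, 1)            # Control
e1_ri = rnorm(N_ri, mu_ri, 0.3)      # Errors correlated with expected instrument
e2_ri = rnorm(N_ri, e1_ri, 0.1)
D_ri = Z_ri + e1_ri                  # Treatment
Y_ri = D_ri + X_ri + e2_ri           # True effect of treatment is 1

# FWL: residualize a vector on the constant and the control
C_ri = cbind(1, X_ri)
fwl = function(v) as.vector(v - C_ri%*%solve(t(C_ri)%*%C_ri, t(C_ri)%*%v))

# Statistic: sample moment between the re-centered instrument implied by
# shocks g and the FWL'd outcome net of the hypothesized effect b0
ri_stat = function(g, b0) {
  Z_rc = as.vector(w_ri%*%g) - mu_ri
  mean(Z_rc*fwl(Y_ri - b0*D_ri))
}

# Two-sided randomization p-value for H0: beta = b0
ri_pvalue = function(b0) {
  t_obs = ri_stat(g_ri, b0)
  t_sim = replicate(S_ri, ri_stat(rnorm(K_ri, g_means_ri, 1), b0))
  mean(abs(t_sim) >= abs(t_obs))
}

p_null = ri_pvalue(1) # Test the true effect
print(p_null)
```

A confidence set can be formed by collecting the values of `b0` that are not rejected. With only a handful of shocks the test may have little power, which echoes the point that shocks are the level of treatment assignment.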