You can find the source code for this file in the class repository. The direct link is here.

Let’s start with studying static labor supply. We will consider the decision of the agent under the following rule:

\[ \max_{c,h} \frac{c^{1+\eta}}{1+\eta} - \beta \frac{h^{1+\gamma}}{1+\gamma}\\ \text{s.t. } c = \rho \cdot w\cdot h -r + \mu - \beta_0 \cdot 1[h>0] \\ \] The individual takes his wage \(w\) as given, he chooses hours of work \(h\) and consumption \(c\) subject to a given non labor income \(\mu\) as well as a tax regime defined by \(\rho,r\). \(\beta_0\) is a fixed cost associated with working. We ignore participation here, see the homework for including it.

We note already that the non labor income can control for dynamic labor supply since we can have \(\mu= b_t - (1+r)b_{t+1}\). This is part of a larger maximization problem where the agents choose optimaly \(b_t\) over time. We will get there next time.

Interior solution

The first order conditions give us \(w(wh +r - \mu)^\eta = \beta h^\gamma\). There is no closed-form but we can very quickly find an interior solution by using Newton maximization on the function \(f(x) = w(wh +r - \mu)^\eta - \beta h^\gamma\). We iterate on

\[x \leftarrow x - f(x)/f'(x).\]

# function which updates choice of hours using Newton step
# R here is total unearned income (including taxes when not working and all)
ff.newt <- function(x,w,R,eta,gamma,beta) {
  f0 = w*(w*x + R)^eta - beta*x^gamma
  f1 =  eta*w^2 * (w*x + R)^(eta-1) - gamma * beta *x^(gamma-1)
  x  = x - f0/f1 
  x  = ifelse(w*x + R<=0, -R/w + 0.0001,x) # make sure we do not step out of bounds for next iteration
  x  = ifelse(x<0, 0.0001,x)
  x
}

Simulating data

We are going to simulate a data set where agents will choose participation as well as the number of hours if they decide to work. To do that we will solve for the interior solution under a given tax rate and compare this to the option of no-work.

p  = list(eta=-1.5,gamma = 0.8,beta=1,nu=0.5) # define preferences
tx = list(rho=1,r=0) # define a simple tax
N=50000
simdata = data.table(i=1:N,X=rnorm(N))
simdata <- simdata[,lw := X     + rnorm(N)*0.2];      # add a wage which depends on X
simdata <- simdata[,mu := exp(0.3*X + rnorm(N)*0.2)]; # add non-labor income that also depends on X

# we then solve for the choice of hours and consumption
simdata <- simdata[, h := pmax(-mu+tx$r ,0)/exp(lw)+1] # starting value
# for loop for newton method (30 should be enough, it is fast)
for (i in 1:30) {
  simdata[, h := ff.newt(h,tx$rho*exp(lw),mu-tx$r,p$eta,p$gamma,p$beta) ]
}

# attach consumption, value of working
simdata <- simdata[, c  := tx$rho*exp(lw)*h + mu];
simdata <- simdata[, u1 := c^(1+p$eta)/(1+p$eta) - p$beta * h^(1+p$gamma)/(1+p$gamma) ];

At this point we can regress \(\log(w)\) on \(\log(c)\) and \(\log(h)\) and find precisely the parameters of labor supply:

pander(summary(simdata[,lm(lw ~ log(c) + log(h))]))
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.063e-14 2.066e-16 -148.3 0
log(c) 1.5 1.068e-16 1.405e+16 0
log(h) 0.8 2.02e-16 3.96e+15 0
Fitting linear model: lw ~ log(c) + log(h)
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
50000 1.217e-14 1 1

The regression still works, among ecah individual who chooses to work, the FOC is still satified.

pander(summary(simdata[,lm(lw ~ log(c) + log(h))]))
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.063e-14 2.066e-16 -148.3 0
log(c) 1.5 1.068e-16 1.405e+16 0
log(h) 0.8 2.02e-16 3.96e+15 0
Fitting linear model: lw ~ log(c) + log(h)
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
50000 1.217e-14 1 1

Heterogeneity in \(\beta\)

Finally we want to add heterogeneity in the \(\beta\) parameter.

simdata <- simdata[,betai := exp(p$beta*X+rnorm(N)*0.1)]
simdata <- simdata[, h := pmax(-mu+tx$r ,0)/exp(lw)+1]
for (i in 1:30) {
  simdata <- simdata[, h := ff.newt(h,tx$rho*exp(lw),mu-tx$r,p$eta,p$gamma,betai) ]
}

# attach consumption
simdata <- simdata[, c  := exp(lw)*h + mu];

# let's check that the FOC holds
sfit = summary(simdata[,lm(lw ~ log(c) + log(h) + log(betai))])
expect_equivalent(sfit$r.squared,1)
expect_equivalent(coef(sfit)["log(c)",1],-p$eta)
expect_equivalent(coef(sfit)["log(h)",1],p$gamma)

sfit2 = summary(simdata[,lm(lw ~ log(c) + log(h))])
expect_false(coef(sfit2)["log(c)",1]==-p$eta)

sfit3 = summary(simdata[,lm(lw ~ log(c) + log(h) + X)])
pander(sfit2)
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.159 0.00186 -623.3 0
log(c) 2.318 0.008018 289.1 0
log(h) -0.2843 0.004372 -65.02 0
Fitting linear model: lw ~ log(c) + log(h)
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
50000 0.2648 0.9323 0.9323

Simple case of \(\eta=0\)

p  = list(eta=0,gamma = 0.8,beta=1,beta0=0) # define preferences
tx = list(rho=1,r=0) # define a simple tax
N=500
simdata = data.table(i=1:N,X=rnorm(N))
simdata <- simdata[,lw := X     + rnorm(N)*0.2];      # add a wage which depends on X
simdata <- simdata[,mu := exp(0.3*X + rnorm(N)*0.2)]; # add non-labor income that also depends on X
simdata <- simdata[,eps := rnorm(N)*0.1]
simdata <- simdata[,betai := exp(0.5*X+eps)]
simdata <- simdata[, h := (tx$rho*exp(lw)/betai)^(1/p$gamma)]

sfit3 = summary(simdata[,lm(log(h) ~ lw + X)])
pander(sfit3)
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.002594 0.005537 0.4686 0.6396
lw 1.303 0.02833 46.01 2.824e-181
X -0.6797 0.02841 -23.92 1.018e-84
Fitting linear model: log(h) ~ lw + X
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
500 0.1234 0.9663 0.9662
sfit4 = summary(simdata[,lm(lw ~ log(h) + X)])
pander(sfit4)
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.905e-05 0.003824 0.01806 0.9856
log(h) 0.6213 0.0135 46.01 2.824e-181
X 0.6094 0.008978 67.88 1.598e-253
Fitting linear model: lw ~ log(h) + X
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
500 0.08522 0.9929 0.9929

Then we can construct a counter-factual revenue

p2  = list(eta=0,gamma = 1/sfit3$coefficients["lw","Estimate"],beta=1,beta0=0)
tx2 = tx
tx2$rho = 0.9
simdata <- simdata[, h2 := (tx2$rho*exp(lw)/betai)^(1/p2$gamma)]

simdata[, list(totearnings =mean(exp(lw+h)), R1=mean((1-tx$rho)*exp(lw+h)),R2=mean((1-tx2$rho)*exp(lw+h2)) ,R3=mean((1-tx2$rho)*exp(lw+h)) )]
##    totearnings R1     R2       R3
## 1:    38.76464  0 2.8635 3.876464

Heterogeneity in \(\beta\) revisited

Finally we want to add heterogeneity in the \(\beta\) parameter.

p  = list(eta=-1.5,gamma = 0.8,beta=1,beta0=0) # define preferences
tx = list(rho=1,r=0) # define a simple tax
N=5000
simdata = data.table(i=1:N,X=rnorm(N))
simdata <- simdata[,lw := X     + rnorm(N)*0.2];      # add a wage which depends on X
simdata <- simdata[,mu := exp(0.3*X + rnorm(N)*0.2)]; # add non-labor income that also depends on X
simdata <- simdata[,eps := rnorm(N)*0.1]
simdata <- simdata[,betai := exp(0.5*X+eps)]
simdata <- simdata[, h := (tx$rho*exp(lw)/betai)^(1/p$gamma)]
simdata <- simdata[, betai := exp(p$beta*X+rnorm(N)*0.1)]
simdata <- simdata[, h := pmax(-mu+tx$r ,0)/exp(lw)+1]
for (i in 1:30) {
  simdata <- simdata[, h := ff.newt(h,tx$rho*exp(lw),mu-tx$r,p$eta,p$gamma,betai) ]
}

# attach consumption
simdata <- simdata[, c  := exp(lw)*h + mu];

# let's check that the FOC holds
sfit = summary(simdata[,lm(lw ~ log(c) + log(h) + log(betai))])
expect_equivalent(sfit$r.squared,1)
expect_equivalent(coef(sfit)["log(c)",1],-p$eta)
expect_equivalent(coef(sfit)["log(h)",1],p$gamma)

sfit2 = summary(simdata[,lm(lw ~ log(c) + log(h))])
expect_false(coef(sfit2)["log(c)",1]==-p$eta)

sfit3 = summary(simdata[,lm(lw ~ log(c) + log(h) + X)])
pander(sfit2)
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.155 0.005661 -204 0
log(c) 2.286 0.02511 91.03 0
log(h) -0.2993 0.01365 -21.92 8.731e-102
Fitting linear model: lw ~ log(c) + log(h)
Observations Residual Std. Error \(R^2\) Adjusted \(R^2\)
5000 0.2588 0.936 0.9359

Next we have established the following condition in the class:

\[ E \left[ \log(w) - \gamma \log h - \eta \log c + \nu x | W,R,X \right] = 0 \] This suggests an IV strategy where we instrument using \(W,R,X\). We then first regress \(h\) and \(c\) on these thre varaibles to get the predicted component, that we then use in a regression.

fit_h = simdata[, lm( log(h) ~ lw + log(mu) + X ) ]
fit_c = simdata[, lm( log(c) ~ lw + log(mu) + X ) ]
simdata[, h_iv := predict(fit_h)]
simdata[, c_iv := predict(fit_c)]

sfit_iv = simdata[, lm( lw ~ h_iv + c_iv + X   )]

pander(sfit_iv)
Fitting linear model: lw ~ h_iv + c_iv + X
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.01775 1.567e-16 1.132e+14 0
h_iv 0.8212 1.878e-16 4.374e+15 0
c_iv 1.504 2.685e-16 5.602e+15 0
X 1.011 1.282e-16 7.886e+15 0
Fork me on GitHub