Truncated normal hurdle model

In econometrics, the truncated normal hurdle model is a variant of the Tobit model and was first proposed by Cragg in 1971.[1]

The standard Tobit model is written as $y=(x\beta+u)\,1[x\beta+u>0]$, where $u\mid x\sim N(0,\sigma^{2})$. This construction implicitly imposes two first-order assumptions:[2]

  1. Since $\partial P[y>0]/\partial x_{j}=\varphi(x\beta/\sigma)\beta_{j}/\sigma$ and $\partial\operatorname{E}[y\mid x,y>0]/\partial x_{j}=\beta_{j}\{1-\theta(x\beta/\sigma)\}$, the partial effects of $x_{j}$ on the probability $P[y>0]$ and on the conditional expectation $\operatorname{E}[y\mid x,y>0]$ have the same sign.[3]
  2. The relative effects of $x_{h}$ and $x_{j}$ on $P[y>0]$ and $\operatorname{E}[y\mid x,y>0]$ are identical, i.e.:
     $$\frac{\partial P[y>0]/\partial x_{h}}{\partial P[y>0]/\partial x_{j}}=\frac{\partial\operatorname{E}[y\mid x,y>0]/\partial x_{h}}{\partial\operatorname{E}[y\mid x,y>0]/\partial x_{j}}=\frac{\beta_{h}}{\beta_{j}}$$
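The two Tobit restrictions can be checked numerically. The following is a minimal sketch, assuming the form $\theta(z)=\lambda(z)[z+\lambda(z)]$ with $\lambda(z)=\varphi(z)/\Phi(z)$ from Wooldridge (2002); the function names and the numerical values of `x`, `beta` and `sigma` are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z)."""
    return STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def tobit_partial_effects(x, beta, sigma):
    """Return (dP[y>0]/dx_j, dE[y|x,y>0]/dx_j) for every regressor j."""
    z = sum(xj * bj for xj, bj in zip(x, beta)) / sigma
    lam = inverse_mills(z)
    theta = lam * (z + lam)  # assumed form of theta; lies between 0 and 1
    d_prob = [STD_NORMAL.pdf(z) * bj / sigma for bj in beta]
    d_cond_mean = [bj * (1.0 - theta) for bj in beta]
    return d_prob, d_cond_mean


# Hypothetical values, for illustration only.
x = [1.0, 0.5, -2.0]
beta = [0.3, 1.2, -0.4]
sigma = 1.5

d_prob, d_cond_mean = tobit_partial_effects(x, beta, sigma)

# Relative effects of regressors 1 and 2 on the two quantities.
ratio_prob = d_prob[1] / d_prob[2]
ratio_mean = d_cond_mean[1] / d_cond_mean[2]
print(ratio_prob, ratio_mean, beta[1] / beta[2])
```

Both ratios coincide with `beta[1] / beta[2]` whatever the values of `x` and `sigma`, which is the restriction stated in assumption (2).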

However, these two implicit assumptions are too strong and are inconsistent with many settings in economics. For instance, when deciding whether to invest in building a factory, the construction cost might be more influential than the product price; but once the factory has been built, the product price matters more for revenue. Hence, implicit assumption (2) does not match this setting.[4] The essence of the issue is that the standard Tobit model implicitly imposes a very strong link between the participation decision ($y=0$ or $y>0$) and the amount decision (the magnitude of $y$ when $y>0$). If a corner solution model is written in the general form $y=s\cdot w$, where $s$ is the participation decision and $w$ is the amount decision, the standard Tobit model assumes:

$$s=1[x\beta+u>0];$$
$$w=x\beta+u.$$

To make the model compatible with more contexts, a natural improvement is to assume:

$$s=1[x\gamma+u>0],\ \text{where } u\sim N(0,1);$$

$w=x\beta+e$, where the error term $e$ follows a truncated normal distribution with density $\varphi(e/\sigma)/[\sigma\,\Phi(x\beta/\sigma)]$ for $e>-x\beta$;

$s$ and $w$ are independent conditional on $x$.
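The three assumptions above translate directly into a data-generating process. The sketch below draws $y=s\cdot w$ for a fixed covariate vector, using only the Python standard library; the helper names, the parameter values and the use of rejection sampling for the truncated normal draw are illustrative assumptions, not part of the model definition.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def draw_truncated_normal(mean, sigma, rng):
    """Draw w ~ N(mean, sigma^2) conditional on w > 0 (rejection sampling)."""
    while True:
        w = rng.gauss(mean, sigma)
        if w > 0:
            return w


def simulate_hurdle(x, gamma, beta, sigma, rng):
    """Return one draw y = s * w for covariate vector x."""
    # Participation decision: probit with index x*gamma.
    s = 1 if dot(x, gamma) + rng.gauss(0.0, 1.0) > 0 else 0
    # Amount decision: drawn independently of s, as the model assumes.
    w = draw_truncated_normal(dot(x, beta), sigma, rng)
    return s * w


# Hypothetical parameter values, for illustration only.
rng = random.Random(0)
x = [1.0, 0.5]
gamma = [-0.2, 0.6]
beta = [1.0, 2.0]
sigma = 1.5

draws = [simulate_hurdle(x, gamma, beta, sigma, rng) for _ in range(20000)]
share_positive = sum(y > 0 for y in draws) / len(draws)
print(share_positive, STD_NORMAL.cdf(dot(x, gamma)))
```

In a large sample the share of positive draws should be close to $\Phi(x\gamma)$, and it does not depend on $\beta$ or $\sigma$.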

This is called the truncated normal hurdle model, proposed in Cragg (1971).[1] By adding one more parameter vector, $\gamma$, and detaching the amount decision from the participation decision, the model can fit more settings. Under this setup, the density of $y$ given $x$ can be written as:

$$f(y\mid x)=[1-\Phi(x\gamma)]^{1[y=0]}\cdot\left[\frac{\Phi(x\gamma)}{\Phi(x\beta/\sigma)}\,\varphi\!\left(\frac{y-x\beta}{\sigma}\right)\Big/\sigma\right]^{1[y>0]}$$

From this density, the model reduces to the standard Tobit model when $\gamma=\beta/\sigma$. This shows that the truncated normal hurdle model is more general than the standard Tobit model.
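The reduction to the Tobit model can be verified numerically. The following sketch evaluates the hurdle density and the standard Tobit density at the same points, setting the probit index $x\gamma$ equal to $x\beta/\sigma$; the function names and the numerical values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def hurdle_density(y, x_gamma, x_beta, sigma):
    """Density of y given x under the truncated normal hurdle model."""
    if y < 0:
        return 0.0
    if y == 0:
        return 1.0 - STD_NORMAL.cdf(x_gamma)  # probability mass at zero
    scale = STD_NORMAL.cdf(x_gamma) / STD_NORMAL.cdf(x_beta / sigma)
    return scale * STD_NORMAL.pdf((y - x_beta) / sigma) / sigma


def tobit_density(y, x_beta, sigma):
    """Density of y given x under the standard Tobit model."""
    if y < 0:
        return 0.0
    if y == 0:
        return 1.0 - STD_NORMAL.cdf(x_beta / sigma)
    return STD_NORMAL.pdf((y - x_beta) / sigma) / sigma


# Hypothetical index values, for illustration only.
x_beta, sigma = 0.8, 1.5
points = [0.0, 0.1, 1.0, 2.5, 6.0]

# Impose the restriction gamma = beta / sigma, i.e. x*gamma = x*beta / sigma.
restricted = [hurdle_density(y, x_beta / sigma, x_beta, sigma) for y in points]
tobit = [tobit_density(y, x_beta, sigma) for y in points]
max_gap = max(abs(a - b) for a, b in zip(restricted, tobit))
print(max_gap)
```

With any other value of the probit index the two densities differ, both in the mass at zero and in the scale of the positive part.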

The truncated normal hurdle model is usually estimated by maximum likelihood estimation (MLE). The log-likelihood function can be written as:

$$\begin{aligned}\ell(\beta,\gamma,\sigma)={}&\sum_{i=1}^{N}\Big\{1[y_{i}=0]\log[1-\Phi(x_{i}\gamma)]+1[y_{i}>0]\log[\Phi(x_{i}\gamma)]\\[5pt]&{}+1[y_{i}>0]\left[-\log\left[\Phi\left(\frac{x_{i}\beta}{\sigma}\right)\right]+\log\left(\varphi\left(\frac{y_{i}-x_{i}\beta}{\sigma}\right)\right)-\log(\sigma)\right]\Big\}\end{aligned}$$

Because the log-likelihood separates into a part that depends only on $\gamma$ and a part that depends only on $(\beta,\sigma)$, $\gamma$ can be estimated by a probit model and $(\beta,\sigma)$ by a truncated normal regression on the observations with $y_{i}>0$.[5] From these estimates, consistent estimates of the average partial effects can then be computed.
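The separation of the log-likelihood into a probit part and a truncated regression part can be illustrated as follows. This is a sketch only: it evaluates the likelihood pieces and does not perform the maximisation, and the four observations and the parameter values are made up for the example.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def probit_loglik(gamma, data):
    """Part of the log-likelihood that depends only on gamma."""
    total = 0.0
    for x, y in data:
        p = STD_NORMAL.cdf(dot(x, gamma))
        total += log(p) if y > 0 else log(1.0 - p)
    return total


def truncated_loglik(beta, sigma, data):
    """Part that depends only on (beta, sigma); uses observations with y > 0."""
    total = 0.0
    for x, y in data:
        if y > 0:
            xb = dot(x, beta)
            total += (-log(STD_NORMAL.cdf(xb / sigma))
                      + log(STD_NORMAL.pdf((y - xb) / sigma))
                      - log(sigma))
    return total


def hurdle_loglik(beta, gamma, sigma, data):
    """Full log-likelihood, computed as the sum of log f(y_i | x_i)."""
    total = 0.0
    for x, y in data:
        xg, xb = dot(x, gamma), dot(x, beta)
        if y == 0:
            density = 1.0 - STD_NORMAL.cdf(xg)
        else:
            density = (STD_NORMAL.cdf(xg) / STD_NORMAL.cdf(xb / sigma)
                       * STD_NORMAL.pdf((y - xb) / sigma) / sigma)
        total += log(density)
    return total


# Hypothetical data (x_i, y_i) and parameter values, for illustration only.
data = [([1.0, 0.2], 0.0), ([1.0, 1.5], 2.3), ([1.0, -0.7], 0.0), ([1.0, 0.9], 0.4)]
gamma, beta, sigma = [0.1, 0.8], [0.5, 1.0], 1.2

full = hurdle_loglik(beta, gamma, sigma, data)
parts = probit_loglik(gamma, data) + truncated_loglik(beta, sigma, data)
print(full, parts)
```

Since the two parts share no parameters, each can be passed to a numerical optimiser separately, and the resulting estimates jointly maximise the full log-likelihood.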

See also

  • Hurdle model
  • Tobit model

References

  1. ^ a b Cragg, John G. (September 1971). "Some Statistical Models for Limited Dependent Variables with Application to the Demand for Durable Goods". Econometrica. 39 (5): 829–844. doi:10.2307/1909582. JSTOR 1909582.
  2. ^ Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Mass., p. 690.
  3. ^ Here, the notation follows Wooldridge (2002). The function $\theta(z)=\lambda(z)[z+\lambda(z)]=-\lambda'(z)$, where $\lambda(z)=\varphi(z)/\Phi(z)$, can be shown to lie between 0 and 1.
  4. ^ For more application examples of corner solution models, see: Phaneuf, Daniel J. (1999). "A Dual Approach to Modeling Corner Solutions in Recreation Demand". Journal of Environmental Economics and Management. 37 (1): 85–105. ISSN 0095-0696.
  5. ^ Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Mass., pp. 692–694.