25 September 2012

The Envelope Theorem

I learned the Kuhn-Tucker conditions and the envelope theorem in school, but I always felt I didn't have a complete understanding of them. One confusing thing about the envelope theorem is that it comes in different varieties, depending on the flavor of the optimization problem. I decided to review the envelope theorem today and keep my notes on the blog.

Consider the constrained problem of maximizing a function f(x,a) with respect to x subject to g(x,a) = 0, where x is an n-vector and a is a scalar parameter. Let M(a) be the maximum value of the problem; that is,

M(a) = max_x f(x,a) s.t. g(x,a) = 0.

The Lagrangian is then L(x,λ,a) = f(x,a) − λg(x,a), giving n+1 first-order conditions: n from ∂L/∂x_i = 0 and one from the constraint itself (or n first-order conditions and m complementary slackness conditions for the more general problem with m inequality constraints). These conditions yield the optimizing argument x*(a), the multiplier λ*(a), and the maximum value M(a) = f(x*(a), a).

The envelope theorem states that if x*(a) is a C^1 function and the usual constraint qualification is satisfied (∇_x g(x*(a), a) ≠ 0), then M'(a) = ∂L/∂a evaluated at x = x*(a) and λ = λ*(a). To put it crudely, to differentiate the maximum value with respect to a parameter (say, for comparative statics), one only needs to differentiate the Lagrangian with respect to the parameter, holding x and λ fixed, and then "plug in" the solution, rather than explicitly find M(a) and then differentiate it.
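As a quick sanity check of the statement, here is a minimal Python sketch on a toy problem of my own choosing (an assumption for illustration, not part of the theorem): maximize f(x,a) = x1*x2 + a*x1 subject to g(x,a) = x1 + x2 − a = 0. Solving the first-order conditions by hand gives x1* = a, x2* = 0 and λ* = a. The function names are hypothetical; the code compares a finite-difference derivative of M(a) with ∂L/∂a evaluated at the solution.

```python
# Toy problem (an assumption for illustration):
#   maximize f(x, a) = x1*x2 + a*x1  subject to  g(x, a) = x1 + x2 - a = 0

def f(x1, x2, a):
    return x1 * x2 + a * x1

def solve(a):
    """Hand-derived solution of the first-order conditions: (x1*, x2*, lambda*)."""
    return a, 0.0, a

def M(a):
    """Maximum value M(a) = f(x*(a), a)."""
    x1, x2, _ = solve(a)
    return f(x1, x2, a)

def dL_da(x1, x2, lam, a):
    """Partial derivative of L = f - lam*g w.r.t. a, holding x and lam fixed."""
    # df/da = x1 and dg/da = -1, so dL/da = x1 - lam*(-1) = x1 + lam
    return x1 + lam

def M_prime_numeric(a, h=1e-5):
    """Central finite-difference approximation of M'(a)."""
    return (M(a + h) - M(a - h)) / (2 * h)

if __name__ == "__main__":
    a = 1.5
    x1, x2, lam = solve(a)
    print("finite difference M'(a):", M_prime_numeric(a))
    print("envelope formula dL/da :", dL_da(x1, x2, lam, a))
```

The two printed numbers should agree up to floating-point error, even though dL_da never looks at how x*(a) moves with a.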

The proof of this version of the envelope theorem is a straightforward calculation. Since M(a) = f(x*(a), a), the chain rule gives

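M'(a) = Σ_i (∂f/∂x_i)(∂x_i/∂a) + ∂f/∂a,
where the sum runs over i = 1,..., n, the partial derivatives of f are evaluated at x = x*(a), and ∂x_i/∂a denotes the derivative of the i-th component of x*(a).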

By the first-order conditions, ∂f/∂x_i = λ∂g/∂x_i for each i. Substituting this into the sum gives

M'(a) = λ Σ_i (∂g/∂x_i)(∂x_i/∂a) + ∂f/∂a.

The constraint holds identically in a, that is, g(x*(a), a) = 0 for every a. Differentiating this identity with respect to a yields Σ_i (∂g/∂x_i)(∂x_i/∂a) + ∂g/∂a = 0, so Σ_i (∂g/∂x_i)(∂x_i/∂a) = −∂g/∂a. So we get

M'(a) = −λ∂g/∂a + ∂f/∂a evaluated at x = x*(a) and λ = λ*(a).
But of course, −λ∂g/∂a + ∂f/∂a = ∂L/∂a, which is exactly the claim.
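To see the three expressions for M'(a) in the proof agree numerically, here is a small Python sketch on another toy problem I made up (an assumption for illustration): maximize f(x,a) = x1*x2 subject to g(x,a) = x1 + a*x2 − 1 = 0 with a > 0. The hand-derived solution is x1* = 1/2, x2* = λ* = 1/(2a). The helper names are hypothetical, and the derivatives of x*(a) are approximated by finite differences.

```python
# Toy problem (an assumption for illustration):
#   maximize f(x, a) = x1*x2  subject to  g(x, a) = x1 + a*x2 - 1 = 0,  a > 0

def solve(a):
    """Hand-derived solution of the first-order conditions: (x1*, x2*, lambda*)."""
    return 0.5, 0.5 / a, 0.5 / a

def proof_terms(a, h=1e-6):
    """Return the three expressions for M'(a) that appear in the proof."""
    x1, x2, lam = solve(a)

    # Central finite-difference approximation of dx_i*/da
    x1_hi, x2_hi, _ = solve(a + h)
    x1_lo, x2_lo, _ = solve(a - h)
    dx1 = (x1_hi - x1_lo) / (2 * h)
    dx2 = (x2_hi - x2_lo) / (2 * h)

    # Partial derivatives evaluated at the optimum
    df_dx = (x2, x1)    # (df/dx1, df/dx2)
    dg_dx = (1.0, a)    # (dg/dx1, dg/dx2)
    df_da = 0.0         # f does not depend on a directly
    dg_da = x2

    chain_rule = df_dx[0] * dx1 + df_dx[1] * dx2 + df_da          # chain rule on f(x*(a), a)
    after_foc = lam * (dg_dx[0] * dx1 + dg_dx[1] * dx2) + df_da   # after df/dx_i = lam*dg/dx_i
    envelope = -lam * dg_da + df_da                               # equals dL/da
    return chain_rule, after_foc, envelope

if __name__ == "__main__":
    print(proof_terms(2.0))
```

Since M(a) = 1/(4a) in this example, all three values should be close to −1/(4a^2).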

The more general case with m inequality constraints has a proof of a similar flavor: differentiate the maximum value function with respect to the parameter, collect the terms, and then use the first-order conditions and the complementary slackness conditions to cancel terms. The theorem can be further generalized to the case in which the parameter is a k-vector rather than a scalar.
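For the inequality case, a one-variable toy problem (my own choice, an assumption for illustration) shows how complementary slackness enters: maximize f(x) = −(x − 2)^2 subject to x − a ≤ 0. With L = −(x − 2)^2 − μ(x − a), the formula gives ∂L/∂a = μ. The sketch below uses a hand-derived Kuhn-Tucker solution and hypothetical function names, and checks one binding and one slack case away from the kink at a = 2.

```python
# Toy problem (an assumption for illustration):
#   maximize f(x) = -(x - 2)**2  subject to  x - a <= 0
# Lagrangian: L = -(x - 2)**2 - mu*(x - a), so dL/da = mu.

def solve(a):
    """Hand-derived Kuhn-Tucker solution: (x*, mu*)."""
    if a < 2.0:
        # Constraint binds: x* = a, and the first-order condition
        # -2*(x - 2) - mu = 0 gives mu* = 2*(2 - a) > 0.
        return a, 2.0 * (2.0 - a)
    # Constraint is slack: unconstrained optimum x* = 2, so mu* = 0.
    return 2.0, 0.0

def M(a):
    """Maximum value M(a) = f(x*(a))."""
    x, _ = solve(a)
    return -(x - 2.0) ** 2

def M_prime_numeric(a, h=1e-6):
    """Central finite-difference approximation of M'(a)."""
    return (M(a + h) - M(a - h)) / (2 * h)

if __name__ == "__main__":
    for a in (1.0, 3.0):  # one binding case, one slack case
        x, mu = solve(a)
        print("a =", a,
              "| mu*(x - a) =", mu * (x - a),
              "| numeric M'(a) =", M_prime_numeric(a),
              "| dL/da = mu =", mu)
```

In both cases μ(x − a) is zero, and the multiplier μ matches the numerical derivative of M(a): positive when the constraint binds, zero when it is slack.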