Several Constraints & the Bordered Hessian

From one wall to the corner where walls meet

In the previous guide a single equality constraint g(x) = 0 trapped you on one curved surface, and the [[lagrange-multiplier|Lagrange multiplier]] rule said the only places worth checking are where nabla f is parallel to nabla g — where the level set of f just grazes the constraint surface. Geometrically, nabla g points straight off the surface, so the condition nabla f = lambda nabla g says: the part of f's gradient that lies along the surface has been cancelled to nothing, and there is no downhill direction left that you are allowed to take.

Now impose two constraints at once, g_1(x) = 0 and g_2(x) = 0. Each one is a surface; insisting on both pins you to where the surfaces cross. In three dimensions that intersection is no longer a surface but a curve — a wire bent through space. With three constraints in three variables it shrinks to isolated points. Every constraint you add eats one degree of freedom: with n variables and m independent constraints, you are confined to a feasible set of dimension n minus m. The optimisation problem has not gotten harder so much as narrower.

Picture standing in the corner where two walls of a room meet. To stay legal you must keep your back against both walls at once, so you can only slide along the seam where they join. A ball pushed into that seam settles where the push has no component left along the seam; but now there are two walls each pushing back, and the push is held in check by some combination of both. That balance of several forces is exactly what the multiple-constraint rule will write down.

The rule for several constraints

Here is the generalisation. At a constrained extremum subject to g_1 = 0, ..., g_m = 0, the gradient of the objective must be a linear combination of the constraint gradients: nabla f = lambda_1 nabla g_1 + lambda_2 nabla g_2 + ... + lambda_m nabla g_m. Each constraint earns its own multiplier lambda_i. This is the heart of the [[multiple-constraints|multiple-constraint Lagrange method]], and it says exactly what the corner picture suggested — the objective's gradient is propped up entirely by the constraint forces, with no leftover component pointing along the feasible set.

There is an honest condition hiding here, and it is easy to skip. The rule is only guaranteed at a regular point: one where the constraint gradients nabla g_1, ..., nabla g_m are linearly independent. This is the constraint qualification. If two of your constraint surfaces are tangent, or one constraint secretly repeats another, their gradients can collapse onto the same line, the 'corner' degenerates, and the multipliers may fail to exist even at a genuine optimum. Independence of the constraint gradients is the several-constraint version of requiring nabla g to be nonzero in the single-constraint case: it guarantees the feasible set really has the clean dimension n minus m you expect, with a well-defined tangent space.
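A small sketch of how the qualification can fail, using a hypothetical example that is not taken from the guide: the unit sphere g_1 = x^2 + y^2 + z^2 - 1 and the plane g_2 = z - 1, which touch only at the north pole (0, 0, 1). That single point is the whole feasible set, so it is automatically the optimum of the (arbitrarily chosen) objective f = x, yet no multipliers exist there.

```python
# Hypothetical example: sphere g1 = x^2 + y^2 + z^2 - 1 and tangent plane g2 = z - 1.
# They touch only at (0, 0, 1), so that point is the entire feasible set.

def grad_g1(x, y, z):
    return (2 * x, 2 * y, 2 * z)

def grad_g2(x, y, z):
    return (0.0, 0.0, 1.0)

def grad_f(x, y, z):
    # Objective f = x, an arbitrary choice for illustration.
    return (1.0, 0.0, 0.0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

p = (0.0, 0.0, 1.0)
n1, n2, df = grad_g1(*p), grad_g2(*p), grad_f(*p)

# In R^3 two vectors are linearly dependent exactly when their cross product is zero.
regular = any(abs(c) > 1e-12 for c in cross(n1, n2))

# Here the gradients are dependent and n2 is nonzero, so their span is the line
# through n2. grad f lies in that span only if it is parallel to n2.
reachable = all(abs(c) < 1e-12 for c in cross(df, n2))

print("regular point:", regular)                      # regular point: False
print("grad f in span of constraint gradients:", reachable)   # ... False
```

Both gradients point along the z-axis at the pole, so no choice of lambda_1 and lambda_2 can reproduce a gradient with an x-component.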

In practice you never juggle the vector equation by hand. You fold everything into one Lagrangian function and hunt for its stationary point. Define L(x, lambda_1, ..., lambda_m) = f(x) - lambda_1 g_1(x) - ... - lambda_m g_m(x). Setting all the partial derivatives of L to zero recovers two things at once: differentiating in the original variables x reproduces nabla f = sum lambda_i nabla g_i, and differentiating in each lambda_i simply reproduces the constraint g_i = 0. So one stationarity condition on L bundles the balance of forces and the feasibility requirements into a single system to solve.

1. Build the Lagrangian L = f - lambda_1 g_1 - ... - lambda_m g_m, one multiplier per constraint.
2. Take the partial of L with respect to every original variable and set each to zero. That is the balance-of-gradients condition.
3. Take the partial of L with respect to every multiplier and set each to zero. This just restores the original constraints g_i = 0.
4. Solve the whole system together for the variables and the multipliers. Each solution is a constrained critical point: a candidate, not yet a verdict.
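The steps above can be sketched on a small problem. The example is hypothetical (it does not come from the guide): minimise f = x^2 + y^2 + z^2 subject to g_1 = x + y + z - 1 = 0 and g_2 = x - y - 1 = 0. Because f is quadratic and the constraints are linear, the stationarity system happens to be linear, so a plain exact solver is enough; a nonlinear problem would need a root finder instead. The helper name `solve_linear` is an illustration, not a library function.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Assumes the system is nonsingular, so a nonzero pivot exists.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[-1] for row in m]

# L = f - l1*g1 - l2*g2, unknowns ordered (x, y, z, l1, l2).
A = [
    [2,  0, 0, -1, -1],   # dL/dx  = 2x - l1 - l2 = 0
    [0,  2, 0, -1,  1],   # dL/dy  = 2y - l1 + l2 = 0
    [0,  0, 2, -1,  0],   # dL/dz  = 2z - l1      = 0
    [1,  1, 1,  0,  0],   # dL/dl1 = 0  <=>  x + y + z = 1
    [1, -1, 0,  0,  0],   # dL/dl2 = 0  <=>  x - y = 1
]
b = [0, 0, 0, 1, 1]
x, y, z, l1, l2 = solve_linear(A, b)
print(x, y, z, l1, l2)   # 5/6 -1/6 1/3 2/3 1

# Balance of gradients: grad f = l1 * grad g1 + l2 * grad g2.
grad_f = (2 * x, 2 * y, 2 * z)
combo = (l1 * 1 + l2 * 1, l1 * 1 + l2 * (-1), l1 * 1 + l2 * 0)
print(grad_f == combo)   # True
```

The solver returns a candidate only. Whether (5/6, -1/6, 1/3) is a minimum still has to be settled by a second-order check.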

Why the ordinary second-derivative test breaks

Solving the system gives you candidate points, but it never tells you which are minima, which are maxima, and which are neither. In unconstrained problems you settled this with the second-derivative test: form the Hessian of second partials and check its definiteness — positive definite means a bowl (a minimum), negative definite a dome (a maximum), indefinite a saddle. It is tempting to just apply that same test to the Lagrangian and be done. That temptation is a trap.

The reason it fails is subtle and worth seeing clearly. On a constrained problem you do not care how f curves in every direction — only how it curves along the feasible set, the directions you are actually allowed to move. A point can look like a saddle of the full Lagrangian, curving up some forbidden way and down another, yet be a perfectly good minimum once you confine yourself to the constraint curve. The plain Hessian answers the wrong question; it includes directions that step off the constraint, directions the problem forbids. You need a curvature test that has been told which directions are legal.
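A minimal numerical illustration, assuming the hypothetical problem f = x*y subject to g = x + y - 2 = 0. Solving its Lagrangian system by hand gives the candidate (1, 1) with lambda = 1. The sketch compares the curvature of the Lagrangian along a forbidden direction with its curvature along the one legal direction.

```python
# Hypothetical example: f = x*y subject to g = x + y - 2 = 0.
# Candidate point (x, y) = (1, 1) with lambda = 1 (worked out by hand).
H = [[0.0, 1.0],
     [1.0, 0.0]]          # Hessian of L = x*y - lambda*(x + y - 2) in (x, y)
grad_g = (1.0, 1.0)

def curvature(H, d):
    """Second directional derivative d^T H d."""
    return sum(d[i] * H[i][j] * d[j] for i in range(2) for j in range(2))

tangent = (-grad_g[1], grad_g[0])   # perpendicular to grad g: a legal direction
normal = grad_g                     # steps off the constraint: forbidden

print(curvature(H, normal))    # 2.0
print(curvature(H, tangent))   # -2.0
```

The full Hessian curves up one way and down the other, so it reads as a saddle. Along the constraint line only the negative curvature survives, so this candidate is a constrained maximum (indeed x*(2 - x) peaks at x = 1). The same mismatch can hide a constrained minimum.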

The bordered Hessian

The fix is the [[bordered-hessian|bordered Hessian]], and the name says how it is built. Start with the Hessian of the Lagrangian L taken in the original variables alone — the second partials of L with respect to x_i and x_j. Then border it: wrap rows and columns made of the first partials of the constraints around the outside, and put a block of zeros in the corner where the constraints meet each other. For one constraint g in two variables the layout is a 3-by-3 matrix; for m constraints in n variables it is an (n + m)-by-(n + m) matrix with an m-by-m zero block in the top-left corner.

One constraint g, two variables x, y.  Border with the gradient of g:

          |   0      g_x     g_y  |
   H_b =  |  g_x    L_xx    L_xy  |
          |  g_y    L_yx    L_yy  |

  g_x, g_y   = first partials of the constraint   (the BORDER)
  L_xx ...   = second partials of the Lagrangian  (the inner Hessian)
  corner 0   = second partial of L in lambda (L is linear in lambda)

 m constraints, n variables  ->  (n+m) x (n+m), with an m x m zero block.

Layout of the bordered Hessian: the constraint gradients wrap an outer border around the Lagrangian's inner Hessian, with zeros in the corner.
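The layout can be assembled mechanically. The sketch below uses a hypothetical helper, `bordered_hessian`, that takes the constraint gradients as the rows of an m-by-n list `jac` and the inner Hessian of the Lagrangian as an n-by-n list `hess`; both names are assumptions made for illustration.

```python
def bordered_hessian(jac, hess):
    """Assemble [[0, J], [J^T, H]] from an m x n Jacobian and an n x n Hessian."""
    m, n = len(jac), len(hess)
    # Top block rows: m zeros, then one constraint gradient.
    top = [[0] * m + list(row) for row in jac]
    # Bottom block rows: the i-th partial of every constraint, then row i of H.
    bottom = [[jac[k][i] for k in range(m)] + list(hess[i]) for i in range(n)]
    return top + bottom

# One constraint, two variables: g = x + y - 2, L_xx = L_yy = 0, L_xy = 1.
for row in bordered_hessian([[1, 1]], [[0, 1], [1, 0]]):
    print(row)
# [0, 1, 1]
# [1, 0, 1]
# [1, 1, 0]

# Two constraints, three variables: a 5 x 5 matrix with a 2 x 2 zero block.
hb = bordered_hessian([[1, 1, 1], [1, -1, 0]],
                      [[2, 0, 0], [0, 2, 0], [0, 0, 2]])
print(len(hb), len(hb[0]))   # 5 5
```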

Why does bordering accomplish the 'delete the forbidden directions' job from the last section? Because the constraint-gradient rows and columns act as a filter. When you compute the relevant determinants of this matrix, the border in effect restricts the inner Hessian to the tangent space of the constraints: curvature in directions you are not allowed to move drops out of the sign, and only the curvature you can actually feel remains. The bordered Hessian is, in disguise, the Lagrangian's Hessian restricted to the tangent space of the feasible set. You are testing the bowl-or-dome question, but only along legal directions.
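For the one-constraint, two-variable layout the filtering can be checked directly by expanding the determinant. A short derivation, assuming L_xy = L_yx and writing t = (-g_y, g_x) for a vector tangent to the constraint (the symbol t is introduced here for illustration):

```latex
\det H_b
= \begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{xy} & L_{yy} \end{vmatrix}
= -\left( g_y^{2} L_{xx} - 2\, g_x g_y L_{xy} + g_x^{2} L_{yy} \right)
= -\, t^{\mathsf T}
  \begin{pmatrix} L_{xx} & L_{xy} \\ L_{xy} & L_{yy} \end{pmatrix} t,
\qquad t = \begin{pmatrix} -g_y \\ g_x \end{pmatrix}.
```

The determinant is exactly minus the curvature of L along the tangent direction. Curvature along nabla g never enters, and a positive determinant means the Lagrangian curves downward along the constraint.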

Reading the sign of the minors

The verdict comes from the signs of the leading principal minors (the determinants of the top-left square blocks) of the bordered Hessian. The bordering shifts everything, so the rule is not the simple 'all positive for a minimum' you used in the unconstrained case. With n variables and m constraints, you look only at the last n minus m leading minors, those of order 2m + 1 up to n + m. The first 2m are either zero or have a sign fixed by the border alone, so they carry no information. You check the remaining ones against a pattern that depends on m and n.

Take the cleanest case to anchor the pattern: one constraint, two variables, so the 3-by-3 matrix above and a single minor that matters, its full determinant. If det(H_b) is positive, the constrained critical point is a local maximum; if det(H_b) is negative, it is a local minimum. Notice the sign feels backwards from the unconstrained test (a positive determinant signals a maximum here), and that flip is not a typo but a direct consequence of the border and its zero corner. In general, for a minimum the relevant minors all share the sign of (-1)^m. For a maximum they alternate in sign, starting with the sign of (-1)^(m+1) at order 2m + 1 and ending with the full determinant, which has the sign of (-1)^n. When a relevant minor is zero the test is inconclusive, exactly as the unconstrained test goes silent on a degenerate Hessian.
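The sign pattern can be sketched as a small classifier. The helper names `det` and `classify` are hypothetical, the determinant uses exact fractions to avoid rounding near zero, and the test matrices are illustrative examples rather than cases from the guide. The code assumes the zero block sits in the top-left corner, as in the layout above.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by exact Gaussian elimination with row swaps."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return sign * result

def classify(hb, n, m):
    """Check the last n - m leading principal minors against the sign pattern."""
    minors = {k: det([row[:k] for row in hb[:k]])
              for k in range(2 * m + 1, n + m + 1)}
    if any(d == 0 for d in minors.values()):
        return "inconclusive"
    # Minimum: every relevant minor has the sign of (-1)^m.
    if all((d > 0) == ((-1) ** m > 0) for d in minors.values()):
        return "min"
    # Maximum: the order-k minor has the sign of (-1)^(k - m).
    if all((d > 0) == ((-1) ** (k - m) > 0) for k, d in minors.items()):
        return "max"
    return "neither"

# f = x*y on x + y = 2, candidate (1, 1): determinant is +2.
print(classify([[0, 1, 1], [1, 0, 1], [1, 1, 0]], n=2, m=1))   # max
# f = x^2 + y^2 on x + y = 2, candidate (1, 1): determinant is -4.
print(classify([[0, 1, 1], [1, 2, 0], [1, 0, 2]], n=2, m=1))   # min
# f = x^2 + y^2 + z^2 with two linear constraints: determinant is +12, m = 2.
hb5 = [[0, 0, 1, 1, 1], [0, 0, 1, -1, 0],
       [1, 1, 2, 0, 0], [1, -1, 0, 2, 0], [1, 0, 0, 0, 2]]
print(classify(hb5, n=3, m=2))                                 # min
```

With two constraints the minimum shows up as a positive determinant, which is the (-1)^m dependence at work.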

What this opens up next

Step back and the shape of the whole method is clear. To optimise under several equality constraints you write one Lagrangian, set every partial to zero to find candidate points where the gradients balance, and then read the bordered Hessian to label each candidate a max, a min, or neither — all while honouring the constraint qualification that keeps the constraint gradients independent. The first-order condition finds the candidates; the second-order condition judges them. This is the same two-act story you already know from single-variable optimisation, now playing out on a curved, lower-dimensional feasible set.

One quiet limitation points straight at the next guide. Everything here assumed equality constraints — surfaces you must stand exactly on. Real engineering far more often hands you inequalities: a budget you must not exceed, a stress you must stay below, a length that cannot be negative. There the feasible set is a solid region with a boundary, and the optimum may sit deep inside (where the constraint does nothing and ordinary stationarity rules) or press right up against an edge (where it behaves like an equality). Sorting out which constraints are active is the leap to the inequality-constrained world and the Karush-Kuhn-Tucker conditions you meet next.