A function inside a function
First, name the thing the rule is for. A composite function is a function applied to the output of another function — you do g first, then feed its result into f. We write it f(g(x)), read 'f of g of x'. For example sin(x^2) is the composite where the inside is g(x) = x^2 and the outside is f(u) = sin(u): square x first, then take the sine of that. Spotting which part is inside and which is outside is the whole battle; once you see it, the rule almost writes itself.
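To make "inside" and "outside" concrete, here is a minimal Python sketch of the sin(x^2) example. The helper name compose and the sample point x = 1.5 are illustrative assumptions, not part of the lesson; the point is only that swapping the roles of f and g gives a different function.

```python
import math

def g(x):
    """Inside function: square the input."""
    return x ** 2

def f(u):
    """Outside function: take the sine."""
    return math.sin(u)

def compose(outer, inner):
    """Return the composite x -> outer(inner(x)). (Hypothetical helper.)"""
    return lambda x: outer(inner(x))

h = compose(f, g)        # h(x) = sin(x^2): square first, then sine
swapped = compose(g, f)  # (sin x)^2: sine first, then square

x = 1.5
print(h(x), math.sin(x ** 2))        # the same value twice
print(swapped(x), math.sin(x) ** 2)  # the same value twice, but different from h(x)
```

Reading a composite from the outside in (sine of something, and the something is x^2) is the same skill as deciding which function to pass as outer and which as inner.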
Linked rates of change
Before the formula, the picture. Imagine three quantities linked in a chain: y depends on u, and u depends on x. Suppose u changes 3 times as fast as x (so a nudge in x of 1 moves u by 3), and y changes 2 times as fast as u. Then how fast does y change as x moves? y moves 2 for every unit of u, and u moves 3 per unit of x, so y moves 2 times 3 = 6 per unit of x. The rates of change simply multiply along the chain. That is the entire idea of the chain rule, and a gear train shows it well: each gear turns the next at a fixed ratio, and the overall ratio is the product of the ratios along the way.
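The 2 times 3 = 6 picture can be checked in a few lines. This sketch assumes the simplest relationships that have those rates, u = 3x and y = 2u; the function names and the helper rate are made up for illustration.

```python
def u_of_x(x):
    return 3 * x  # assumed: u changes 3 times as fast as x

def y_of_u(u):
    return 2 * u  # assumed: y changes 2 times as fast as u

def rate(func, at, step=1.0):
    """Average rate of change of func over the interval [at, at + step]."""
    return (func(at + step) - func(at)) / step

du_dx = rate(u_of_x, 0.0)                        # how fast u moves per unit of x
dy_du = rate(y_of_u, 0.0)                        # how fast y moves per unit of u
dy_dx = rate(lambda x: y_of_u(u_of_x(x)), 0.0)   # how fast y moves per unit of x

print(du_dx, dy_du, dy_dx)  # 3.0 2.0 6.0
```

Because both links are straight lines here, the rates are the same everywhere. For curved functions the rates vary from point to point, which is why the general rule evaluates each derivative at a specific place.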
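In Leibniz notation that reads beautifully: if y = f(u) and u = g(x), then dy/dx = dy/du times du/dx, with dy/du evaluated at u = g(x). It looks as if the du's 'cancel'. The derivatives are limits rather than literal fractions, but the look is honest: for small finite nudges, (change in y)/(change in x) = (change in y)/(change in u) times (change in u)/(change in x) really is fraction cancellation whenever the change in u is not zero, and the chain rule is what survives in the limit. In prime notation the same statement is (f(g(x)))' = f'(g(x)) times g'(x). Say it as a phrase you will never forget: 'derivative of the outside (leaving the inside alone) times the derivative of the inside.'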
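One way to see the 'cancelling' numerically is to use finite nudges, where the ratios really are fractions, and watch them approach f'(g(x)) times g'(x) as the nudge shrinks. This is a sketch under assumptions: the choice of sin(x^2), the point x = 1, and the step sizes are arbitrary, and the cancellation needs the change in u to be nonzero, which holds here.

```python
import math

def g(x):
    return x ** 2       # inside: u = g(x)

def f(u):
    return math.sin(u)  # outside: y = f(u)

x = 1.0
for dx in (0.1, 0.01, 0.001):
    du = g(x + dx) - g(x)        # change in u caused by the nudge dx
    dy = f(g(x + dx)) - f(g(x))  # change in y caused by the same nudge
    # With finite nudges these are genuine fractions, so du cancels (du != 0 here).
    print(dx, (dy / du) * (du / dx), dy / dx)

exact = math.cos(g(x)) * 2 * x   # f'(g(x)) * g'(x) at x = 1
print(exact)
```

The two printed ratios agree on every row up to rounding, and both drift toward the exact value as dx gets smaller.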
Three worked examples
Let us run the phrase on the three functions you have been promised. Each time: peel off the outside, differentiate it while keeping the inside untouched, then multiply by the derivative of the inside. The recipe never changes.
- (3x+1)^5. Outside is f(u) = u^5 (by the power rule f'(u) = 5u^4); inside is g(x) = 3x+1 with g'(x) = 3. So the derivative is 5(3x+1)^4 times 3 = 15(3x+1)^4. Note we kept the inside 3x+1 untouched inside the power, then tacked on the 3.
- sin(x^2). Outside is f(u) = sin(u) with f'(u) = cos(u); inside is g(x) = x^2 with g'(x) = 2x. So the derivative is cos(x^2) times 2x = 2x cos(x^2). Beware: it is NOT cos(2x), and it is NOT cos(x^2) by itself — the 2x factor must be there.
- e^{kx}, with k a constant. Outside is f(u) = e^u with f'(u) = e^u; inside is g(x) = kx with g'(x) = k. So the derivative is e^{kx} times k = k e^{kx}. This single result underlies exponential growth and decay: the rate of change is k times the current value, so the function grows when k > 0 and decays when k < 0.
h(x)  = (3x + 1)^5
outer = u^5      ->  5u^4   (leave u = 3x+1 alone)
inner = 3x + 1   ->  3
h'(x) = 5(3x+1)^4 * 3 = 15(3x+1)^4
Stacking it up, and what comes next
The chain rule nests as deeply as the function does. For sin(e^{3x}) there are three layers (sine of e to the power of 3x), so you multiply three rates: cos(e^{3x}) times e^{3x} times 3 = 3 e^{3x} cos(e^{3x}). Work from the outermost shell inward, peeling one layer at a time and multiplying by the derivative of each new inside as you go. The chain rule also combines with the product and quotient rules: to differentiate x^2 sin(x^3), use the product rule on the two factors, and reach for the chain rule when you hit sin(x^3), whose derivative is cos(x^3) times 3x^2. The result is 2x sin(x^3) + 3x^4 cos(x^3).
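The two stacked examples can be checked the same way. In this sketch the function names, the helper numeric_derivative, and the sample point x0 = 0.4 are illustrative assumptions; the derivative formulas are written out layer by layer so each factor is visible.

```python
import math

def numeric_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def nested(x):
    return math.sin(math.exp(3 * x))  # three layers: sin, exp, 3x

def nested_prime(x):
    inner = math.exp(3 * x)
    # outermost layer, then middle layer, then innermost layer
    return math.cos(inner) * inner * 3

def product(x):
    return x ** 2 * math.sin(x ** 3)

def product_prime(x):
    # Product rule on the two factors; chain rule on sin(x^3).
    d_first = 2 * x                              # derivative of x^2
    d_second = math.cos(x ** 3) * 3 * x ** 2     # derivative of sin(x^3)
    return d_first * math.sin(x ** 3) + x ** 2 * d_second

x0 = 0.4
print(numeric_derivative(nested, x0), nested_prime(x0))
print(numeric_derivative(product, x0), product_prime(x0))
```

Each printed pair should agree to several decimal places, which is a useful habit whenever a derivative has more than two layers to keep track of.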
With the chain rule alongside the product and quotient rules, you can now differentiate most of the functions you meet day to day. Next we will use it in disguise: implicit differentiation, where the chain rule lets you differentiate equations like x^2 + y^2 = 1 without first solving for y, and related rates, where two quantities changing in time are linked by exactly this multiply-along-the-chain idea.