Suppose we’re at the point \((a,b)\) and we want to find the rate of change of a function \(f(x,y)\) as we move in the direction of the unit vector \(\uv u=(\cos\theta,\sin\theta)\text{.}\) This is called the directional derivative of \(f\) in the direction of \(\uv u\text{,}\) and is written \(\DS\pdv{f}{\uv u}\text{.}\)
If we think of a function of multiple variables \(f(x,y)\) as a function of a vector input, \(f(\bv{r})\) with \(\bv{r}=(x,y)\text{,}\) we can also write this as the limit \(\DS\pdv{f}{\uv u}=\lim_{h\to 0}\dfrac{f(\bv{r}+h\uv{u})-f(\bv{r})}{h}\text{.}\)
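Applying the chain rule to \(f(a+h\cos\theta,\,b+h\sin\theta)\) at \(h=0\) gives the directional derivative as a function of the angle. Let \(\DS D(\theta)=\pdv{f}{\uv{u}}=\pdv{f}{x}\cos\theta + \pdv{f}{y}\sin\theta\text{,}\) with the partial derivatives evaluated at \((a,b)\text{.}\) Then \(\DS\dv{D}{\theta}=-\pdv{f}{x}\sin\theta + \pdv{f}{y}\cos\theta\text{.}\) Setting this equal to zero gives \(\DS\tan\theta=\dfrac{\pdv*{f}{y}}{\pdv*{f}{x}}\) (provided \(\pdv*{f}{x}\neq 0\)). Any unit vector \(\uv{u}\) for which this holds is parallel to the vector \(\DS\qty(\pdv{f}{x},\pdv{f}{y})\text{.}\) There are two such directions, differing by \(\pi\text{.}\) If \(\uv{u}\) points in the same direction as \(\DS\qty(\pdv{f}{x},\pdv{f}{y})\text{,}\) then the directional derivative is maximized; if it points in the opposite direction, the directional derivative is minimized.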
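As a numerical sanity check of this argument, the sketch below searches a fine grid of angles for the one that maximizes \(D(\theta)\) and compares it with the angle of the vector of partial derivatives. The function `f`, the point `(a, b)`, and the grid size `n` are assumptions chosen only for illustration.

```python
import math

def partials(x, y):
    # Analytic partial derivatives of the hypothetical example
    # f(x, y) = x**2 * y + sin(y).
    fx = 2 * x * y
    fy = x**2 + math.cos(y)
    return fx, fy

def D(theta, fx, fy):
    # Directional derivative in the direction (cos(theta), sin(theta)).
    return fx * math.cos(theta) + fy * math.sin(theta)

a, b = 1.0, 2.0          # illustrative point, not from the text
fx, fy = partials(a, b)

# Brute-force search over a fine grid of angles in [0, 2*pi).
n = 100_000
thetas = [2 * math.pi * k / n for k in range(n)]
theta_best = max(thetas, key=lambda t: D(t, fx, fy))

# Angle of the vector (fx, fy); atan2 selects the correct quadrant,
# which tan(theta) = fy / fx alone cannot do.
theta_grad = math.atan2(fy, fx) % (2 * math.pi)

print(theta_best, theta_grad)
```

Using `atan2` rather than inverting \(\tan\theta\) directly distinguishes the maximizing direction from the minimizing one, which satisfies the same tangent equation.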
Define the gradient vector \(\Grad f\) as the vector that points in the direction of greatest increase and whose magnitude is the value of the directional derivative in that direction. In Cartesian coordinates we then have \(\DS\Grad f=\qty(\pdv{f}{x},\pdv{f}{y})\text{.}\)
Notice the directional derivative is then the dot product of the gradient and the unit vector! Thus the directional derivative can be written \(\DS\pdv{f}{\uv{u}}=\Grad f\cdot\uv{u}\text{.}\)
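To see the dot-product formula at work, here is a minimal sketch that compares \(\Grad f\cdot\uv{u}\) with a central finite-difference estimate of the rate of change along \(\uv{u}\text{.}\) The example function, the helper names, and the step size `h` are assumptions made for illustration.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Analytic gradient of the example f above.
    return (2 * x * y, x**2 + math.cos(y))

def directional_derivative(x, y, theta):
    # Gradient dotted with the unit vector (cos(theta), sin(theta)).
    ux, uy = math.cos(theta), math.sin(theta)
    gx, gy = grad_f(x, y)
    return gx * ux + gy * uy

def finite_difference(x, y, theta, h=1e-6):
    # Central-difference estimate of the rate of change of f along u.
    ux, uy = math.cos(theta), math.sin(theta)
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)

theta = math.pi / 3
exact = directional_derivative(1.0, 2.0, theta)
approx = finite_difference(1.0, 2.0, theta)
print(exact, approx)
```

The two printed values should agree to several decimal places; the remaining difference comes from the finite step size and floating-point rounding.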
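Since the directional derivative is a dot product with a unit vector, it equals \(\lvert\Grad f\rvert\cos\phi\text{,}\) where \(\phi\) is the angle between \(\Grad f\) and \(\uv{u}\text{.}\) It is therefore maximized, with value \(\lvert\Grad f\rvert\text{,}\) when \(\uv{u}\) points in the same direction as the gradient, and minimized, with value \(-\lvert\Grad f\rvert\text{,}\) when \(\uv{u}\) points in the direction opposite to the gradient.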
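The extreme cases can be illustrated with a short sketch that evaluates the directional derivative along, against, and perpendicular to the gradient. The example gradient and the evaluation point are assumptions for illustration, and the code assumes the gradient is nonzero there.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical example f(x, y) = x**2 * y + sin(y).
    return (2 * x * y, x**2 + math.cos(y))

def dot(v, w):
    # Dot product of two 2D vectors.
    return v[0] * w[0] + v[1] * w[1]

g = grad_f(1.0, 2.0)
norm_g = math.hypot(g[0], g[1])   # assumed nonzero at this point

u_along = (g[0] / norm_g, g[1] / norm_g)   # same direction as the gradient
u_against = (-u_along[0], -u_along[1])     # opposite direction
u_perp = (-u_along[1], u_along[0])         # rotated by 90 degrees

d_along = dot(g, u_along)      # expected to equal  |grad f|
d_against = dot(g, u_against)  # expected to equal -|grad f|
d_perp = dot(g, u_perp)        # expected to vanish up to rounding

print(d_along, d_against, d_perp, norm_g)
```

The perpendicular case shows that \(f\) does not change to first order when we move at right angles to the gradient.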