6.2 The geometric anatomy of lines and hyperplanes

In this Section we describe important characteristics of the hyperplane, including its directions of steepest ascent and steepest descent.

Press the 'Toggle code' button below to toggle code on and off for this entire presentation.

In [5]:

from IPython.display import display
from IPython.display import HTML
import IPython.core.display as di # Example: di.display_html('<h3>%s:</h3>' % str, raw=True)

# This line will hide code by default when the notebook is exported as HTML
di.display_html('<script>jQuery(function() {if (jQuery("body.notebook_app").length == 0) { jQuery(".input_area").toggle(); jQuery(".prompt").toggle();}});</script>', raw=True)

# This line will add a button to toggle visibility of code blocks, for use with the HTML export version
di.display_html('''<button onclick="jQuery('.input_area').toggle(); jQuery('.prompt').toggle();">Toggle code</button>''', raw=True)

6.2.1 Single input hyperplanes

The formula for a line

\begin{equation} g(w) = a + bw \end{equation}

tells us two things for specific choices of $a$ and $b$: the point at which the line intersects the vertical axis (given by $a$) and the steepness or slope of the line (given by the coefficient $b$).
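As a quick check of this reading, the short sketch below (not part of the original notebook; the helper name `line` and the values chosen for `a` and `b` are illustrative assumptions) evaluates a line at $w=0$ to recover its vertical intercept and takes a unit step in $w$ to recover its slope.

```python
# Illustrative sketch: recover the intercept and slope of g(w) = a + b*w.
# The names `line`, `a`, and `b` are assumptions made for this example.
a, b = 2.0, -1.0

def line(w):
    """Evaluate g(w) = a + b*w."""
    return a + b * w

intercept = line(0.0)           # value where the line crosses the vertical axis
slope = line(1.0) - line(0.0)   # change in g per unit change in w

print(intercept, slope)
```

Because the line is linear, any two distinct inputs would give the same slope once the difference in $g$ is divided by the difference in $w$.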

In [2]:

# create two linear functions
func1 = lambda w: 2*w
func2 = lambda w: -w + 2

# use custom plotter to show both functions
title1 = '$g(w)=2w$'; title2 = '$g(w)=-w+2$'
callib.plotter.double_2d_plot(func1 = func1, func2 = func2, title1 = title1, title2 = title2, fontsize = 13, color = 'lime')

Viewed as a direction (shown in black), the slope is often referred to as the direction of steepest ascent, since it tells us which way we must travel along the input axis to increase the value of the line the fastest.

\begin{equation} \text{steepest ascent direction of a line} = \text{its slope} \,\, b. \end{equation}

Drawn as a vector, the slope provides a simple visualization of a) the direction in which the line is increasing (the sign of $b$) and b) how quickly it is increasing in this direction (the magnitude of $b$).

In [3]:

# animate 2d slope visualizer
func = lambda w: 2 + 3*w
callib.slope_visualizer.animate_visualize2d(func = func,num_frames = 50)

Out[3]:

Notice that, by the same logic, the value $-b$ gives the direction of steepest descent on the line (shown in red).

\begin{equation} \text{steepest descent direction of a line} = \text{its negative slope} \,\, -b. \end{equation}
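The two directions can be checked numerically. The following is a minimal sketch, not part of the original notebook: it reuses the line $g(w) = 2 + 3w$ from the animation, while the starting point `w0` and step length `step` are arbitrary assumptions.

```python
# Illustrative sketch: stepping along b raises g, stepping along -b lowers it.
a, b = 2.0, 3.0
g = lambda w: a + b * w   # the line g(w) = 2 + 3w

w0 = 1.0     # arbitrary starting input (assumption)
step = 0.1   # arbitrary step length (assumption)

ascent_change = g(w0 + step * b) - g(w0)    # equals  step * b**2, never negative
descent_change = g(w0 - step * b) - g(w0)   # equals -step * b**2, never positive

print(ascent_change, descent_change)
```

The same signs result for a negative slope, since the change in $g$ depends on $b^2$.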

In three dimensions we can form a similar equation using a single input, for example

\begin{equation} g(w_1,w_2) = a + bw_1 \end{equation}

This hyperplane still has a steepness or slope given by $b$.
The only difference is that this steepness is now defined over a two dimensional input space.

In [4]:

# plot a single input hyperplane in both two and three dimensions
func1 = lambda w: 2-2*w
func2 = lambda w: 2-2*w[0]

# use custom plotter to show both functions
title1 = '$g(w)=2-2w$'; title2 = '$g(w_1,w_2)=2-2w_1$'
callib.plotter.double_2d3d_plot(func1 = func1, func2 = func2, title1 = title1, title2 = title2, fontsize = 18, color = 'lime')

Like the single input example, here we can visualize the directions of steepest ascent and descent as well.

For example, for the hyperplane $g(w_1,w_2) = 2-2w_1$ from the previous example, where $b = -2$, the ascent direction is $\left(b,0\right) = (-2,0)$ and the descent direction is $-\left(b,0\right) = (2,0)$.

In the animation below the ascent direction is shown in blue and the descent direction in red.

In [5]:

# define hyperplane
func = lambda w: 2-2*w[0]

# animate 3d slope visualizer
callib.slope_visualizer.animate_visualize3d(func=func,num_frames=50)

Out[5]:
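The ascent and descent directions for $g(w_1,w_2) = 2-2w_1$ can also be checked without the animation. The sketch below is an illustrative addition that assumes NumPy is available. It scans unit-length directions around the circle and records which one raises or lowers $g$ the most. The grid of 360 angles and the starting point `w0` are arbitrary choices.

```python
import numpy as np

# hyperplane from the example above: g(w1, w2) = 2 - 2*w1
g = lambda w: 2 - 2 * w[0]

# unit-length directions around the circle (360 angles is an arbitrary choice)
angles = np.linspace(0, 2 * np.pi, 360, endpoint=False)
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)

w0 = np.array([0.5, -1.0])   # arbitrary starting point (assumption)
changes = np.array([g(w0 + d) - g(w0) for d in directions])

best_ascent = directions[np.argmax(changes)]    # points along (b, 0) = (-2, 0)
best_descent = directions[np.argmin(changes)]   # points along -(b, 0) = (2, 0)

print(best_ascent, best_descent)
```

The directions found are the unit-length versions of $(-2,0)$ and $(2,0)$. The second input plays no role because $g$ does not depend on $w_2$.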

We can define this single input hyperplane along any dimension we want. In general if we have $N$ possible inputs $\mathbf{w}=[w_1,\,\,w_2,\,\,\cdots\,w_N]$ we can define it along the $n^{th}$ dimension as $g(\mathbf{w}) = a + bw_n$.

6.2.2 Constructing general hyperplanes when $N=2$

With multiple inputs we can form more complex hyperplanes by summing up a number of single input ones like those discussed above.

For example, with $N=2$ inputs we can form the two single input hyperplanes

\begin{equation} \begin{array}{l} g_1(w_1,w_2) = a_1 + b_1 w_1 \\ g_2(w_1,w_2) = a_2 + b_2 w_2 \end{array} \end{equation}

Adding these together gives a more complex hyperplane $g(w_1,w_2) = g_1(w_1,w_2) + g_2(w_1,w_2)= \left( a_1 + a_2 \right) + b_1w_1 + b_2w_2$ whose slope along each input dimension is controlled explicitly by the corresponding single input hyperplane.

In the animation below, the direction of steepest ascent along each individual input dimension is shown in blue, the overall direction of steepest ascent in black, and the overall direction of steepest descent in red.

In [7]:

# define hyperplane
func = lambda w: 2 - 2*w[0] - 2*w[1]

# animate 3d slope visualizer
callib.slope_visualizer.animate_visualize3d(func=func,num_frames=50)

Out[7]:
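The construction by summation can be sketched directly in code. This is an illustrative addition that assumes NumPy. The coefficients `a1`, `b1`, `a2`, `b2` are chosen here so that the sum reproduces the hyperplane $2 - 2w_1 - 2w_2$ used in the animation, and the test point `w` is arbitrary.

```python
import numpy as np

# two single input hyperplanes (coefficients chosen for illustration)
a1, b1 = 1.0, -2.0
a2, b2 = 1.0, -2.0
g1 = lambda w: a1 + b1 * w[0]   # varies only along w_1
g2 = lambda w: a2 + b2 * w[1]   # varies only along w_2

# their sum, and the collected form (a1 + a2) + b1*w_1 + b2*w_2
g_sum = lambda w: g1(w) + g2(w)
g_collected = lambda w: (a1 + a2) + b1 * w[0] + b2 * w[1]

w = np.array([0.3, -0.7])   # arbitrary test point (assumption)
print(g_sum(w), g_collected(w))
```

Both forms return the same value at any input, which is the point of collecting terms.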

In general, for $N$ dimensional input we can define a single input hyperplane along each dimension

\begin{equation} \begin{array}{l} g_1(\mathbf{w}) = a_1 + b_1 w_1 \\ g_2(\mathbf{w}) = a_2 + b_2 w_2 \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \\ g_N(\mathbf{w}) = a_N + b_N w_N \end{array} \end{equation}

and summing them up as $g(\mathbf{w}) = \sum_{n=1}^N g_n(\mathbf{w})$ gives, after collecting terms,

\begin{equation} g(\mathbf{w}) = (a_1 + a_2 + \cdots + a_N) + (b_1 w_1 + b_2 w_2 + \cdots + b_N w_N). \end{equation}

We can write this formula more compactly using vector notation. Denoting the constant as

\begin{equation} a = \sum_{n=1}^{N} a_n \end{equation}

and denoting by $\mathbf{b}$ the $N\times 1$ vector

\begin{equation} \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix} \end{equation}

we have our hyperplane written as

\begin{equation} g(\mathbf{w}) = a + \mathbf{w}^T\mathbf{b} \end{equation}
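The compact form maps directly onto an inner product. The sketch below is an illustrative addition that assumes NumPy. The function name `hyperplane` and the numerical values are hypothetical, chosen only to show that the vector form agrees with the sum of single input hyperplanes.

```python
import numpy as np

def hyperplane(w, a, b):
    """Evaluate g(w) = a + w^T b for 1-D arrays w and b of equal length."""
    return a + np.dot(w, b)

# illustrative per-dimension constants a_n and slopes b_n for N = 3
a_n = np.array([0.5, 1.0, -0.5])
b = np.array([1.0, -2.0, 3.0])
a = a_n.sum()                     # a = sum of the a_n = 1.0

w = np.array([2.0, 0.0, 1.0])
value = hyperplane(w, a, b)       # 1.0 + (2.0 + 0.0 + 3.0) = 6.0
value_by_sum = np.sum(a_n + b * w)  # sum of the N single input hyperplanes

print(value, value_by_sum)
```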

Again, such an $N$ dimensional hyperplane has a direction of steepest ascent given by $\mathbf{b}$. By the same logic, the vector $-\mathbf{b}$ gives the direction of steepest descent, or the fastest way to move downward on the hyperplane.
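One way to see why, sketched here as a short supporting argument: for any unit-length direction $\mathbf{d}$, moving from $\mathbf{w}$ to $\mathbf{w} + \mathbf{d}$ changes the value of the hyperplane by

\begin{equation} g(\mathbf{w} + \mathbf{d}) - g(\mathbf{w}) = \mathbf{d}^T\mathbf{b} \leq \Vert \mathbf{d} \Vert_2 \Vert \mathbf{b} \Vert_2 = \Vert \mathbf{b} \Vert_2 \end{equation}

by the Cauchy-Schwarz inequality. Assuming $\mathbf{b} \neq \mathbf{0}$, equality holds exactly when $\mathbf{d} = \mathbf{b} / \Vert \mathbf{b} \Vert_2$, so no unit-length direction increases $g$ faster than the direction of $\mathbf{b}$. Applying the same bound to $-\mathbf{d}$ shows the largest decrease, $-\Vert \mathbf{b} \Vert_2$, is attained at $\mathbf{d} = -\mathbf{b} / \Vert \mathbf{b} \Vert_2$.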

\begin{equation} \text{steepest ascent direction} = \text{entire vector of slope parameters} \,\, \mathbf{b}. \end{equation}

\begin{equation} \text{steepest descent direction} = \text{negative entire vector of slope parameters} \,\, - \mathbf{b}. \end{equation}
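A numerical spot check of these two statements for general $N$ is sketched below. It is an illustrative addition that assumes NumPy. The dimension `N`, the random seed, and the number of sampled directions are arbitrary choices. It compares the change in $g$ along $\pm\mathbf{b}$ (normalized to unit length) with the change along many random unit-length directions.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
N = 5                            # arbitrary input dimension (assumption)
a = 1.0
b = rng.normal(size=N)           # random slope vector
g = lambda w: a + np.dot(w, b)   # hyperplane g(w) = a + w^T b
w0 = rng.normal(size=N)          # arbitrary starting point

# sample random unit-length directions, one per row
D = rng.normal(size=(1000, N))
D /= np.linalg.norm(D, axis=1, keepdims=True)
random_changes = D @ b           # g(w0 + d) - g(w0) = d^T b for each row d

# change in g along the normalized ascent and descent directions
b_unit = b / np.linalg.norm(b)
ascent_change = g(w0 + b_unit) - g(w0)    # equals  ||b||
descent_change = g(w0 - b_unit) - g(w0)   # equals -||b||

print(ascent_change, random_changes.max())
print(descent_change, random_changes.min())
```

None of the sampled directions should beat $\mathbf{b}$ for ascent or $-\mathbf{b}$ for descent, up to floating point rounding.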