In this Section we introduce the general framework of nonlinear regression, along with a range of examples from toy datasets to classic problems arising in the study of differential equations.
These examples are all low dimensional, allowing us to visually examine patterns in the data and propose appropriate nonlinearities, which we can (as we will see) very quickly inject into our linear supervised paradigm to produce nonlinear regression fits.
By walking through these examples we flesh out, in concrete terms, a number of important concepts, coding principles, and jargon terms in a relatively simple environment - ideas that will be omnipresent in our discussion of nonlinear learning going forward.
In our discussion of linear regression we employed a `model` taking the generic form

\begin{equation} \text{model}\left(x_1,...,x_N,\mathbf{w}\right) = w_0 + x_1 w_1 + x_2 w_2 + \cdots + x_N w_N \end{equation}

or more compactly

\begin{equation} \text{model}\left(\mathbf{x},\mathbf{w}\right) = \mathbf{x}^T \mathbf{w} \end{equation}

where we use our 'compact' notation denoting

\begin{equation} \mathbf{w}=\begin{bmatrix} w_{0}\\ w_{1}\\ w_{2}\\ \vdots\\ w_{N} \end{bmatrix} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \mathbf{x}=\begin{bmatrix} 1 \\ x_{1}\\ x_{2}\\ \vdots\\ x_{N} \end{bmatrix}. \end{equation}

We can implement this `model` compactly in Python as follows.
import numpy as np

# compute linear combination of input points
def model(x,w):
    # tack a 1 onto the top of each input point all at once
    o = np.ones((1,np.shape(x)[1]))
    x = np.vstack((o,x))

    # compute linear combination and return
    a = np.dot(x.T,w)
    return a
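As a quick sanity check, here is a small usage example on hypothetical data (the array shapes and random values below are purely illustrative): with $P$ input points stored columnwise in an $N \times P$ array, `model` returns a $P \times 1$ array of predictions.

# illustrative only: evaluate the linear model on random data
x_sample = np.random.randn(2,5)           # N = 2 input dimensions, P = 5 points stored columnwise
w_sample = np.random.randn(3,1)           # N + 1 = 3 weights, including the bias w_0
print(model(x_sample,w_sample).shape)     # -> (5, 1), one prediction per point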
Using this `model` notation our ideal linear relationship between input and output can be written as

\begin{equation} \text{model}\left(\mathbf{x},\mathbf{w}\right) \approx y. \end{equation}

However there is no reason why the model used in the construction of our regression must be a linear one. More generally we can take a linear combination of $B$ nonlinear functions of the input

\begin{equation} \text{model}\left(\mathbf{x},\Theta\right) = w_0 + f_1\left(\mathbf{x}\right)w_1 + f_2\left(\mathbf{x}\right)w_2 + \cdots + f_B\left(\mathbf{x}\right)w_B \end{equation}

where $f_1,\,f_2,\,...,\,f_B$ are nonlinear parameterized or unparameterized functions - or feature transformations - and $\Theta$ denotes the entire set of model parameters: the linear combination weights $w_0,\,w_1,\,...,\,w_B$ together with any parameters internal to the feature transformations themselves. To tune such a `model` we can minimize, for example, the same least squares cost we used for linear regression.
# an implementation of the least squares cost function for linear regression
def least_squares(w):
    cost = np.sum((model(x,w) - y)**2)
    return cost/float(len(y))
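The optimization runs referenced below (`run`, `run1`, `run2`) are produced by backend functionality not shown in this Section. As a minimal sketch of what such a run involves - assuming the `autograd` package for automatic differentiation, which would require the backend to use `autograd.numpy` in place of plain `numpy` - a basic gradient descent loop might look like the following; the step length `alpha` and iteration count `max_its` are illustrative choices.

# a minimal gradient descent sketch (illustrative, not the backend used here)
from autograd import grad

def gradient_descent(cost, w, alpha = 0.1, max_its = 1000):
    gradient = grad(cost)              # automatically-constructed gradient function
    weight_history = [w]               # container for weights at each step
    cost_history = [cost(w)]           # container for cost values at each step
    for k in range(max_its):
        w = w - alpha*gradient(w)      # take a gradient descent step
        weight_history.append(w)
        cost_history.append(cost(w))
    return weight_history, cost_history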
Notice that this `least_squares` implementation is written in terms of a generic `model` - linear or nonlinear - so we can push it to the backend of our code and reuse it everywhere. We can push the `model` function to the backend too, since it will look essentially the same throughout all of our examples. Employing a nonlinear feature transformation, it will always look like

# an implementation of our model employing a nonlinear feature transformation
def model(x,w):
    # feature transformation
    f = feature_transforms(x,w[0])

    # tack a 1 onto the top of each transformed point all at once
    o = np.ones((1,np.shape(f)[1]))
    f = np.vstack((o,f))

    # compute linear combination and return
    a = np.dot(f.T,w[1])
    return a
Here the line `f = feature_transforms(x,w[0])` computes our desired feature transformations. Compare this to our original linear implementation, repeated below - just a few lines extra to compute features.

# compute linear combination of input points
def model(x,w):
    # tack a 1 onto the top of each input point all at once
    o = np.ones((1,np.shape(x)[1]))
    x = np.vstack((o,x))

    # compute linear combination and return
    a = np.dot(x.T,w)
    return a
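One difference from the linear case worth flagging: the nonlinear `model` above expects its `w` argument to be a container of two sets of weights - `w[0]` holding any parameters internal to the feature transformations and `w[1]` holding the final linear combination weights. For a single parameterized feature of a scalar input, a hypothetical initialization might look like

# illustrative only: weights held as a list of two arrays
w = [np.random.randn(2,1), np.random.randn(2,1)]   # w[0]: internal weights, w[1]: linear combination weights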
We push this generic `model` function to the backend as well, since it looks essentially the same regardless of linear/nonlinear features. In practice, then, all we need to do for each example is define an appropriate `feature_transforms` function in Python; the backend constructs the corresponding `model` and feeds it into e.g., `least_squares` to optimize. We begin with the `feature_transforms` function for the simple case of linear regression, using the toy dataset loaded and plotted below.
# load data
csvname = datapath + 'unnorm_linregress_data.csv'
data = np.loadtxt(csvname,delimiter = ',')
x = data[:,:-1].T
y = data[:,-1:]
# plot dataset
demo = regress_plotter.Visualizer(data)
demo.plot_data()
In this notation our linear `model` is then equivalently expressed using the trivial identity feature transformation
# the trivial linear feature transformation
def feature_transforms(x):
    return x
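Notice this transformation takes no internal parameters, while the generic nonlinear `model` shown earlier passes `w[0]` into `feature_transforms`. For unparameterized transformations like this one (and the polynomial one used later), a sketch of the matching `model` variant - with all weights held in a single vector `w` - looks like

# model variant for unparameterized feature transformations (sketch)
def model(x,w):
    # feature transformation - no internal parameters
    f = feature_transforms(x)

    # tack a 1 onto the top of each transformed point
    o = np.ones((1,np.shape(f)[1]))
    f = np.vstack((o,f))

    # compute linear combination and return
    a = np.dot(f.T,w)
    return a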
# pluck out best weights - those that provided lowest cost,
# and plot resulting fit
ind = np.argmin(run.cost_history)
w_best = run.weight_history[ind]
demo.plot_fit(w_best,run.model,normalizer = run.normalizer);
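The `normalizer` passed around above (and in the runs below) refers to an input normalization scheme applied before optimization; the cost history comparison shortly contrasts a run on the original data with one on normalized input. A minimal sketch of a standard normalizer - subtracting each input dimension's mean and dividing by its standard deviation - might look like

# sketch of a standard normalizer: mean-center and scale each input dimension
def standard_normalizer(x):
    x_means = np.mean(x,axis = 1)[:,np.newaxis]   # mean of each input dimension
    x_stds = np.std(x,axis = 1)[:,np.newaxis]     # standard deviation of each input dimension
    normalizer = lambda data: (data - x_means)/x_stds
    return normalizer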
# load data
csvname = datapath + 'noisy_sin_sample.csv'
data = np.loadtxt(csvname,delimiter = ',')
x = data[:,:-1].T
y = data[:,-1:]
# plot dataset
demo = regress_plotter.Visualizer(data)
demo.plot_data()
This dataset appears sinusoidal, so we can employ a parameterized model of the form

\begin{equation} \text{model}\left(x,\Theta\right) = w_0 + w_1 \sin\left(v_0 + v_1 x\right) \end{equation}

where $v_0$ and $v_1$ are parameters internal to the sine feature. To produce a nonlinear fit like this, all we need to do is define our nonlinear `feature_transforms` function.
# our nonlinearity, known as a feature transformation
def feature_transforms(x,w):
    # tack a 1 onto the top of each input point all at once
    o = np.ones((1,np.shape(x)[1]))
    x = np.vstack((o,x))

    # calculate feature transform
    f = np.sin(np.dot(x.T,w)).T
    return f
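As a quick shape check on how this transformation plugs into the generic nonlinear `model` (random values here are illustrative only):

# illustrative only: shape check for the sinusoidal model
x_sample = np.random.randn(1,50)                    # 50 scalar input points, stored columnwise
w = [np.random.randn(2,1), np.random.randn(2,1)]    # internal and linear combination weights
print(feature_transforms(x_sample,w[0]).shape)      # -> (1, 50), one feature value per point
print(model(x_sample,w).shape)                      # -> (50, 1), one prediction per point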
# plot the cost function histories for two gradient descent runs - on original and on normalized data
static_plotter.plot_cost_histories([run1.cost_history,run2.cost_history],start = 0,points = False,labels = ['original data','normalized'])
# pluck out best weights - those that provided lowest cost,
# and plot resulting fit
ind = np.argmin(run2.cost_history)
w_best = run2.weight_history[ind]
demo.plot_fit(w_best,run2.model,normalizer = run2.normalizer);
# plot data and fit in original and feature transformed space
demo.plot_fit_and_feature_space(w_best,run2.model,run2.feature_transforms,normalizer = run2.normalizer)
A properly designed feature (or set of features) provides a good nonlinear fit in the original feature space and, simultaneously, a good linear fit in the transformed feature space.
# load data
csvname = datapath + 'yeast.csv'
data = np.loadtxt(csvname,delimiter = ',')
# get input/output pairs
x = data[:,:-1].T
y = data[:,-1:]
# plot dataset
demo = regress_plotter.Visualizer(data)
demo.plot_data()
This dataset exhibits the sort of saturating behavior well captured by a $\tanh$ function, so here we employ the parameterized model

\begin{equation} \text{model}\left(x,\Theta\right) = w_0 + w_1 \tanh\left(v_0 + v_1 x\right) \end{equation}

that is, a linear combination of a single $\tanh$ feature transformation, implemented as

# our nonlinearity, known as a feature transformation
def feature_transforms(x,w):
    # tack a 1 onto the top of each input point all at once
    o = np.ones((1,np.shape(x)[1]))
    x = np.vstack((o,x))

    # calculate feature transform
    f = np.tanh(np.dot(x.T,w)).T
    return f
# plot data and fit in original and feature transformed space
ind = np.argmin(run1.cost_history)
w_best = run1.weight_history[ind]
demo.plot_fit_and_feature_space(w_best,run1.model,run1.feature_transforms,normalizer = run1.normalizer)
# load data
csvname = datapath + 'galileo_ramp_data.csv'
data = np.loadtxt(csvname,delimiter = ',')
# get input/output pairs
x = data[:,:-1].T
y = data[:,-1:]
# plot dataset
demo = regress_plotter.Visualizer(data)
demo.plot_data()
For this classic dataset a quadratic fit, i.e., the model

\begin{equation} \text{model}\left(x,\mathbf{w}\right) = w_0 + w_1 x + w_2 x^{2} \end{equation}

is a natural choice, and corresponds to the following unparameterized polynomial `feature_transforms` function

def feature_transforms(x):
    # calculate feature transform: input raised to degrees 1 and 2
    f = np.array([(x.flatten()**d) for d in range(1,3)])
    return f
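Since this polynomial transformation has no internal parameters, the model is linear in its weights, and the least squares weights are also available in closed form via the normal equations. A sketch of this computation (the variable names `F` and `w_closed` below are ours, introduced for illustration) is

# sketch: closed-form least squares weights for unparameterized features
f = feature_transforms(x)                                   # 2 x P polynomial features
F = np.vstack((np.ones((1,np.shape(f)[1])),f))              # stack ones on top: 3 x P
w_closed = np.linalg.solve(np.dot(F,F.T),np.dot(F,y))       # solve the normal equations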
# plot data and fit in original and feature transformed space
ind = np.argmin(run1.cost_history)
w_best = run1.weight_history[ind]
demo.plot_fit_and_feature_space(w_best,run1.model,run1.feature_transforms,normalizer = run1.normalizer,view = [25,100])