Chapter 15: Recurrent networks

15.4 Variable order dynamic systems

In this Section we discuss variable order dynamic systems. Like the dynamic systems we have seen thus far, these are widely used in practice in various areas of science and engineering - notably in physics, feedback systems, and (once again) time series / signal processing. As their name implies - and unlike the systems we have seen previously - the order of these systems varies from step to step.

15.4.1 A prototypical example: the exponential average

In Section 15.2.1 we discussed the moving average - a filtering or smoothing technique that is a prototypical example of a dynamic system with fixed order. Here we discuss an analogous smoothing technique called exponential averaging - which equally well provides a prototype for dynamic systems with variable order.

The exponential average is another smoothing technique - often applied to time series data as a pre-processing step to make it easier to further analyze and work with. Instead of taking a sliding window and averaging the input series inside of it, we compute the average of the entire input sequence in an online fashion, adding the contribution of each input one element at a time. To do this we first form an average of the first two points $x_1$ and $x_2$ of an ordered input sequence (like the time series below) $x_1,\,x_2,...,x_P$. We then take this result and make a weighted combination of it and the third point $x_3$, giving an average of the first three points. We continue in this fashion until the final element of the sequence $x_P$ is reached.

Below this process is animated on top of the original input time series - with the resulting exponential average shown as a pink curve. Note that the first average point - set to be the first point of the input series $x_1$ - is shown as a pink dot.

Below we animate the same running average process as shown above for the same input series - but for only the first 50 points.

Before we see how the exponential average is computed, let's first see how to compute a running average of $T$ input points $x_1,\,x_2,\,...,x_T$ - that is, the average of the first two points, the average of the first three points, and so forth. Naively we could write down this running average as

\begin{array} \ \text{average of the first $1$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_1 = x_1 \\ \text{average of the first $2$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_2 = \frac{x_1 + x_2}{2} \\ \text{average of the first $3$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_3 = \frac{x_1 + x_2 + x_3}{3} \\ \text{average of the first $4$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_4 = \frac{x_1 + x_2 + x_3 + x_4}{4} \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \\ \text{average of the first $t$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_{t} = \frac{x_1 + x_2 + x_3 + x_4 + \cdots + x_t}{t} \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \\ \end{array}

Notice how at each step here the average computation $h_t$ summarizes the input points $x_1$ through $x_t$ via a simple summary statistic: their sample mean. Also note that this extremely simple running calculation is indeed a dynamic system - but it does not have a fixed order: at each step the order of the dynamic system increases by $1$.

However we can write this down more efficiently - as a dynamic system - by studying the form of each computation, where we can see that each average depends on its predecessor. For example, notice how the average of $x_1$ and $x_2$ can be written as

\begin{equation} h_2 = \frac{x_1 + x_2}{2} = \frac{h_1 + x_2}{2} = \frac{1}{2}h_1 + \frac{1}{2}x_2. \end{equation}

Likewise, looking at the next line we can write $h_3$ in terms of its predecessor $h_2$ as

\begin{equation} h_3 = \frac{x_1 + x_2 + x_3}{3} = \frac{2\frac{x_1 + x_2}{2} + x_3}{3} = \frac{2h_2 + x_3}{3} = \frac{2}{3}h_2 + \frac{1}{3}x_3. \end{equation}

We could go on, noting how we can similarly express $h_4$ in terms of $h_3$ and $x_4$, and in general we would discover a pattern for expressing the $t^{th}$ average $h_{t}$ in the same sort of way as

\begin{equation} h_{t} = \frac{t-1}{t}h_{t-1} + \frac{1}{t}x_t. \end{equation}

What is the benefit of writing the running average in this way, as opposed to the naive way given above? Imagine we computed the running average one update at a time from $h_1$ all the way to $h_t$. When compared to the naive way of computing the average

\begin{equation} h_{t} = \frac{x_1 + x_2 + x_3 + x_4 + \cdots + x_t}{t} \end{equation}

here we need to know the explicit values of every input up to and including $x_t$ (that is, $x_1$ through $x_{t}$), whereas with the clever way of writing the same average above we only need access to two values: $x_t$ and $h_{t-1}$. From a computational perspective this is far more memory efficient (since at each step we only need to store and deal with two values as opposed to $t$ of them).
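This recursion is easy to verify numerically. Below is a minimal sketch - using a small, hypothetical input sequence - checking that the recursive update reproduces the naive sample mean at every step:

```python
import numpy as np

# A small, hypothetical input sequence
x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

# Recursive running average: h_1 = x_1, then h_t = ((t-1)/t) h_{t-1} + (1/t) x_t
h = x[0]
running = [h]
for t in range(2, len(x) + 1):
    h = (t - 1) / t * h + x[t - 1] / t
    running.append(h)

# Each state matches the naive sample mean of the first t points
for t in range(1, len(x) + 1):
    assert np.isclose(running[t - 1], np.mean(x[:t]))
```

Note that the loop only ever touches the previous state $h_{t-1}$ and the current input $x_t$ - the memory savings discussed above.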

The exponential average is a simple generalization of this running average formula. Notice that at every step of the clever recursion for the running average in equation (3) the coefficients on the two terms of the update always sum to $1$: that is, $\frac{t-1}{t} + \frac{1}{t} = 1$ for all $t$. At each step both coefficients change - as $\frac{t-1}{t}$ and $\frac{1}{t}$ respectively - and to create the exponential average we use the same update formula but freeze the coefficients. That is, using the same initial value $h_1 = x_1$ we take a value $\alpha \in \left[0,1\right]$ and create a running exponential average via the similar formula

\begin{equation} h_{t} = \alpha \, h_{t-1} + \left(1 - \alpha\right)x_t. \end{equation}
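A minimal sketch of this update in Python (the function name and the tiny test sequence are our own choices) looks as follows:

```python
import numpy as np

def exponential_average(x, alpha):
    """Exponential average: h_1 = x_1, then h_t = alpha*h_{t-1} + (1-alpha)*x_t."""
    h = x[0]
    states = [h]
    for x_t in x[1:]:
        h = alpha * h + (1 - alpha) * x_t
        states.append(h)
    return np.array(states)

# A tiny hypothetical sequence; alpha = 0.5 weighs old average and new point equally
print(exponential_average(np.array([1.0, 2.0, 3.0, 4.0]), 0.5))
```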

Why is this slightly adjusted version of equation (3) called an exponential average? Because if we 'roll back' the update shown above to its initial condition - expressing each update step $h_t$ in terms of all its preceding elements (analogous to the 'roll back' we performed with the exponential growth system detailed in Example 3 of the previous Section) - we can see the exponential relationship explicitly. We show this algebraic manipulation below. Needless to say, with this update - as with the running average - each and every step $h_t$ summarizes the elements of the input that precede it ($x_1$ through $x_t$) via an (exponential) average.

By using the exponential averaging formula above - and substituting in the value of each preceding value $h_{t-1}$, all the way back to $h_1$ - we can 'roll back' the exponential average at each step so that $h_t$ is expressed entirely in terms of the input values $x_1$ through $x_t$ preceding it. For example, substituting the formula $h_{t-1} = \alpha \, h_{t-2} + \left(1 - \alpha\right)x_{t-1}$ into the right hand side above for $h_t$ gives, after simplifying,

\begin{equation} h_t = \alpha \, h_{t-1} + \left(1 - \alpha\right)x_t = \alpha\left(\alpha\, h_{t-2} + \left(1 - \alpha\right)x_{t-1}\right) + \left(1 - \alpha\right) x_t = \alpha^2 \, h_{t-2} + \alpha \left(1 - \alpha\right)x_{t-1} + \left(1 - \alpha\right) x_t. \end{equation}

If we continue doing this we can 'roll back' all the way to the initial condition - which expresses our step $h_t$ in terms of every input sequence value $x_1$ through $x_t$ as

\begin{equation} h_{t} = \alpha^{\,t-1} \, x_1 + \alpha^{\,t-2}\,\left(1 - \alpha\right) x_{2} + \alpha^{\,t-3}\left(1 - \alpha\right)x_3 + \cdots + \alpha\,\left(1 - \alpha\right)x_{t-1} + \left(1 - \alpha\right)x_t. \end{equation}

With these formulae we can make a table of unrolled 'naive' versions of our exponential averages - ending up much where we began with our formal description of the running average above. This listing of 'naive' formulae for the exponential averages looks like the following.

\begin{array} \ \text{exponential average of the first $1$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_1 = x_1 \\ \text{exponential average of the first $2$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_2 = \alpha \, x_1 + \left(1 - \alpha\right) x_2 \\ \text{exponential average of the first $3$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_3 = \alpha^2 \, x_1 + \alpha\,\left(1 - \alpha\right) x_2 + \left(1 - \alpha\right)x_3 \\ \text{exponential average of the first $4$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\, h_4 = \alpha^3 \, x_1 + \alpha^2\,\left(1 - \alpha\right) x_2 + \alpha\left(1 - \alpha\right)x_3 + \left(1 - \alpha\right)x_4\\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \\ \text{exponential average of the first $t$ elements:} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\, h_{t} = \alpha^{\,t-1} \, x_1 + \alpha^{\,t-2}\,\left(1 - \alpha\right) x_{2} + \cdots + \alpha\,\left(1 - \alpha\right)x_{t-1} + \left(1 - \alpha\right)x_t \\ \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \vdots \\ \end{array}

Of course just as with the running average we would prefer our simple update step

\begin{equation} h_{t} = \alpha \, h_{t-1} + \left(1 - \alpha\right)x_t \end{equation}

over such a long list of formulae. Moreover, unravelling the exponential average shows us that:

  • The exponential average (like the running average) is an efficient way of expressing - at each step of computation - a statistic over all of the preceding input using only two values, $h_{t-1}$ and $x_t$. In other words, $h_t$ summarizes the input $x_1$ through $x_t$ via a statistic: its exponential mean.
  • The exponential average is a dynamic system whose order changes at each step (in particular it increases by $1$ element at each step).
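The equivalence of the recursive update and its unrolled form is also easy to check numerically; the sketch below (with an arbitrary $\alpha$ and a hypothetical input sequence) computes both and confirms they agree:

```python
import numpy as np

alpha = 0.8
x = np.array([0.5, 1.5, -0.5, 2.0, 1.0])  # hypothetical inputs
t = len(x)

# Recursive form: h_1 = x_1, h_t = alpha*h_{t-1} + (1-alpha)*x_t
h = x[0]
for x_t in x[1:]:
    h = alpha * h + (1 - alpha) * x_t

# Unrolled form: h_t = alpha^(t-1) x_1 + sum over k=2..t of alpha^(t-k) (1-alpha) x_k
unrolled = alpha ** (t - 1) * x[0] + sum(
    alpha ** (t - k) * (1 - alpha) * x[k - 1] for k in range(2, t + 1)
)

assert np.isclose(h, unrolled)
```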

15.4.2 The generic variable order dynamic system

The basic form of a dynamic system with variable order looks very much like the exponential average (and running average) detailed above, but employs in general a function $f\left(\cdot\right)$ - which need not be a mere linear combination - and can be written out formally as

\begin{array} \ h_{1} = \gamma_1 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, t=1 \\ h_{t} = f\left(h_{t-1},x_t\right) \,\,\,\,\, \,\,\,\,\,\,\, t = 2,...,T. \\ \end{array}

Note that this is a generic form - and there are many variations on this basic theme. However all dynamic systems with variable order share the general characteristics introduced above in the context of the exponential average.
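This generic form is just a fold over the input sequence. A minimal Python sketch (the function and variable names are our own) shows how particular systems arise from particular choices of $f\left(\cdot\right)$:

```python
def run_system(f, x, init):
    """Generic variable order system: h_1 = init, h_t = f(h_{t-1}, x_t) for t >= 2."""
    h = init
    states = [h]
    for x_t in x[1:]:
        h = f(h, x_t)
        states.append(h)
    return states

# Particular systems fall out as particular choices of f
x = [3, 1, 4, 1, 5]
sums = run_system(lambda h, x_t: h + x_t, x, x[0])              # running sum
expavg = run_system(lambda h, x_t: 0.5 * h + 0.5 * x_t, x, x[0])  # exponential average
```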

The examples above showcase several universal properties of variable order dynamic systems (properties we will see in further examples below as well). In particular

1. The step $h_t$ provides a summary of all preceding input values $x_1$ through $x_{t}$.

In the case of the running and exponential averages detailed above, in each instance $h_t$ summarized its corresponding input sequence $x_1$ through $x_t$ by averaging their values. Because of this $h_t$ is often referred to as the state variable in the context of variable order dynamic systems - since it provides a summary of the corresponding input sequence at each step of the system. We prove this fact more rigorously after discussing several additional examples below.

2. The step $h_t$ is a recurrence relation in itself.

While each update $h_t$ is a function of all input values that precede it, it is also recursive in itself. Thus we can have variable order systems with order $\mathcal{O} > 1$ as well, where $h_t$ depends on more than $1$ of its predecessors.

3. The order of a variable order dynamic system increases from iteration-to-iteration.

Just as with the running and exponential average examples detailed above, the order of the generic variable order dynamic system shown above increases from iteration to iteration. In particular the order of this generic form increases by $1$ at each iteration.

Before diving into any more formal details, let's examine a few additional examples of these dynamic systems.

Example 1. The exponential average - examining a range of values

Below we animate a series of exponential averages (each shown in magenta) for the time series shown above, over the range of values $\alpha \in \left[0.75,1\right]$; as you move the slider left to right the value of $\alpha$ increases. Once the value of $\alpha$ gets close to $1$, notice how the average starts to lag heavily. This is because at this point we weigh the previously computed average almost completely, adding only a minute contribution from each subsequent point.


Example 2. The running sum: a simple model of a savings account balance

What is simpler than the running average of an input sequence $x_1,\,x_2,\,...,x_T$? How about a simple running sum of these elements. Instead of taking the average at the $t^{th}$ step we simply sum up the first $t$ values of the input sequence, which we can write compactly as a variable order dynamic system (very similar to the running average described previously) as

\begin{array} \ h_1 = x_1 \\ h_{t} = h_{t-1} + x_{t} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, t = 2,...,T. \end{array}

Just as with the two averaging examples given at the start of this Section, here by writing this running sum compactly we eliminate redundant computation. Moreover the variable $h_t$ summarizes the input sequence $x_1$ through $x_t$ by summing these values.

A running sum like this is often used as a simple model of your savings account balance. Suppose that we measure our income (that is, our pay minus all expenses) at the end of each month for $T$ consecutive months, and denote this by the sequence $x_1,\,x_2,\,...,x_T$. At the end of each period we take any positive income we gained from the previous one and put it into savings (for simplicity we assume we gain no interest on this). If our income was negative - our expenses were greater than our monthly pay - we dip into savings to cover our costs.

In month $t=1$ suppose we start off with nothing in savings, so that after depositing our first month's income our balance is $h_1 = x_1$. At the end of the second month $t=2$ we add our second month's income $x_2$ to our current savings account value $h_1$

\begin{equation} h_2 = h_1 + x_2. \end{equation}

Note that $x_2$ could be negative here, so for the sake of simplicity suppose that we can have 'negative' savings (i.e., we can borrow money from the bank interest free).

At the end of the third month we again tally up our total monthly income $x_3$ and add / subtract it from our savings account balance, giving

\begin{equation} h_3 = h_2 + x_3. \end{equation}

And we can go on and on producing an entire sequence of savings measurements $h_1,\,h_2,\,...,h_T$ via this running sum dynamic system

\begin{array} \ h_1 = x_1 \\ h_{t} = h_{t-1} + x_{t} \,\,\,\,\,\, t = 2,...,T. \end{array}

Here the state variable $h_{t}$ summarizes the input sequence - the monthly incomes - up to time $t$ as a literal sum of all incomes up to that time period $x_1$ through $x_t$.

Below we simulate and then plot (in blue) a sequence of $T$ monthly incomes drawn uniformly from the interval $\left[-1,+1\right]$, and generate the associated savings history (shown in magenta).
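A simulation along these lines can be sketched as follows (the random seed, the horizon $T = 48$, and the variable names are arbitrary choices of ours; plotting is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
T = 48                          # a hypothetical four years of months
income = rng.uniform(-1.0, 1.0, size=T)

# Savings balance as a running sum: h_1 = x_1, h_t = h_{t-1} + x_t
savings = np.empty(T)
savings[0] = income[0]
for t in range(1, T):
    savings[t] = savings[t - 1] + income[t]

# The whole state sequence is just the cumulative sum of the incomes
assert np.allclose(savings, np.cumsum(income))
```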


Example 3. The running Riemann sum

In the instance that $x_1,\,x_2,\,...,x_T$ are $T$ ordered evaluations of a function spaced $\frac{1}{D}$ apart, a slight adjustment to the running sum gives an approximation to the one-dimensional integral or 'area under the curve', known as a Riemann sum. As illustrated in the figure below the Riemann sum approximates the area under a curve by a series of equally spaced rectangles whose heights are equally spaced evaluations of the function.

PICTURE GOES HERE

The Riemann sum of a function up to the $t^{th}$ evaluation $x_t$ is just the sum of the areas of the rectangles defined by it and its predecessors, that is

\begin{equation} h_{t} = \frac{1}{D}x_1 + \frac{1}{D}x_2 + \cdots + \frac{1}{D}x_{t-1} + \frac{1}{D}x_{t} \end{equation}

which - like the running sum (here we are just multiplying each term by $\frac{1}{D}$) - can be defined in terms of its predecessor simply as

\begin{equation} h_{t} = \left(\frac{1}{D}x_1 + \frac{1}{D}x_2 + \cdots + \frac{1}{D}x_{t-1}\right) + \frac{1}{D}x_{t} = h_{t-1} + \frac{1}{D}x_t. \end{equation}

So the running Riemann sum can be written very similarly to the running sum as

\begin{array} \ h_1 = \frac{1}{D}x_1 \\ h_{t} = h_{t-1} + \frac{1}{D}x_{t} \,\,\,\,\,\, t = 2,...,T. \end{array}

Here the state variable $h_{t}$ summarizes the input from $x_1$ through $x_{t}$ in that it is precisely the Riemann sum of the rectangles with these heights.
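As a quick sanity check, the sketch below runs this recursion on a hypothetical integrand of our choosing - $f\left(u\right) = u^2$ on the unit interval, whose exact area is $\frac{1}{3}$ - and compares the final state to the exact value:

```python
import numpy as np

D = 1000                        # evaluations spaced 1/D apart
grid = np.arange(0.0, 1.0, 1.0 / D)
x = grid ** 2                   # evaluations of f(u) = u^2 on [0, 1)

# Running Riemann sum: h_1 = x_1/D, h_t = h_{t-1} + x_t/D
h = x[0] / D
for x_t in x[1:]:
    h = h + x_t / D

# The final state approximates the integral of u^2 over [0, 1], i.e. 1/3
assert abs(h - 1.0 / 3.0) < 1e-3
```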


Example 4. Running product

We can just as easily show that a running product of the numbers $x_1,\,x_2,\,...,x_T$ can be written as a variable order dynamic system, just as we did for the running average and sum. Setting the initial condition $h_1 = x_1$, the $t^{th}$ such product could be written naively as

\begin{equation} h_{t} = x_1\cdot x_2 \cdots x_{t-1}\cdot x_t. \end{equation}

However just as with the running sum we can easily write this recursively in terms of its predecessor as

\begin{equation} h_{t} = h_{t-1} \cdot x_t. \end{equation}

When the sequence $x_1,\,x_2,\,...,x_T$ consists of the consecutive integers $1,2,...,T$ then this is precisely the factorial, which we would commonly express as $h_t = t!$. In any case, here the state variable $h_t$ summarizes the first $t$ inputs via their product.
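A minimal sketch of this system (the function name is our own) on the consecutive integers confirms the factorial connection:

```python
def running_product(x):
    """Running product: h_1 = x_1, then h_t = h_{t-1} * x_t."""
    h = x[0]
    states = [h]
    for x_t in x[1:]:
        h = h * x_t
        states.append(h)
    return states

# On the consecutive integers 1,...,T the states are the factorials t!
print(running_product([1, 2, 3, 4, 5]))  # [1, 2, 6, 24, 120]
```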

Example 5. Historical maximum

We could just as easily compute the historical maximum of an input series using the dynamic system update

\begin{equation} h_t = \text{max}\left(h_{t-1},x_t\right). \end{equation}

At each step $h_t$ summarizes the input sequence $x_1,\,x_2,\,...,x_t$ by recording the maximum value so far in this portion of the sequence.
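A minimal sketch of this update (with a tiny hypothetical sequence, and a function name of our own):

```python
def historical_max(x):
    """Historical maximum: h_1 = x_1, then h_t = max(h_{t-1}, x_t)."""
    h = x[0]
    states = [h]
    for x_t in x[1:]:
        h = max(h, x_t)
        states.append(h)
    return states

print(historical_max([2, 5, 3, 7, 1]))  # [2, 5, 5, 7, 7]
```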

Below we illustrate this historical maximum for a sample sequence - shown in blue. The dynamic system or state variable $h_t$ is shown in magenta, drawn semi-transparent so that every point of the original series remains visible in the same plot.


Example 6. Cheap guitar tuning - determining frequency by zero-crossing counting

Many cheap guitar tuners work by taking in an audio signal - which consists of a sine wave or sum of sine waves - and determining its pitch by counting the number of times the signal crosses zero over a short range of its input. The process of counting the zero crossings of a sine wave can be easily modeled as a variable order dynamic system. For a centered and digitized sine wave like the one shown below in blue, we simply scan through the input sequence two elements at a time, looking for places where $x_{t-1} < 0$ and $x_t > 0$, or vice versa. Hence a dynamic system can be formed where $h_t$ is a running count of the number of zero crossings of the series as

\begin{equation} h_{t} = h_{t-1} + \mathcal{I}_{0}\left(x_{t},x_{t-1}\right) \end{equation}

where $\mathcal{I}_{0}$ is a simple indicator function that equals $1$ if the two points $x_{t-1}$ and $x_t$ straddle $0$, and is equal to zero otherwise.
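A sketch of this counting system on a hypothetical sampled sine wave (the sampling rate and frequency are our own choices) might look like:

```python
import numpy as np

# A sampled sine wave: 5 full periods over the unit interval
t_grid = np.linspace(0.0, 1.0, 500)
x = np.sin(2.0 * np.pi * 5.0 * t_grid)

# Indicator I_0: equals 1 when x_{t-1} and x_t straddle zero
def straddles_zero(prev, curr):
    return 1 if prev * curr < 0 else 0

# Running count of zero crossings: h_t = h_{t-1} + I_0(x_t, x_{t-1})
h = 0
for prev, curr in zip(x[:-1], x[1:]):
    h += straddles_zero(prev, curr)

# Five periods cross zero 9 times strictly inside the interval
print(h)
```

The count grows with the sine wave's frequency, which is exactly what a cheap tuner exploits.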


Example 7. Sending information over the radio

The basic idea of Frequency Modulated (FM) radio can be modeled using a simple addition to the running Riemann sum system in Example 3. As we saw there, the running Riemann sum can be written as a variable order dynamic system as

\begin{array} \ h_1 = \frac{1}{D}x_1 \\ h_{t} = h_{t-1} + \frac{1}{D}x_{t} \,\,\,\,\,\, t = 2,...,T. \end{array}

In the case of FM radio the sequence $x_t$ consists of the information we wish to communicate, transmitted by modulating the frequency of a sinusoid using the running Riemann sum. More specifically, we pass the state variable $h_t$ through a sinusoid, creating an FM sinusoid $y_t$ as

\begin{equation} y_t = \text{sin}\left(h_t\right). \end{equation}

An example of such a Frequency Modulated sinusoid is shown in the animation below. In the top panel we show the raw information we wish to send via the FM signal. In the middle panel we show its running Riemann sum, and in the bottom the corresponding sinusoid. As you move the slider from left to right all three sequences progress.
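A sketch of generating such an FM signal - the information signal here is a hypothetical choice of slowly varying instantaneous frequency, and plotting is omitted:

```python
import numpy as np

D = 100                               # samples per unit of time
t_grid = np.arange(0.0, 4.0, 1.0 / D)

# Hypothetical information signal: the instantaneous frequency we wish to send
x = 2.0 * np.pi * (3.0 + 2.0 * np.sin(2.0 * np.pi * 0.25 * t_grid))

# Running Riemann sum h_t = h_{t-1} + x_t/D accumulates the phase
h = np.cumsum(x) / D

# Frequency modulated sinusoid: y_t = sin(h_t)
y = np.sin(h)
```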



In each of the examples above we saw how the state variable $h_t$ provides a summary of all preceding input $x_1$ through $x_t$. We can see that this is true for every variable order dynamic system by 'rolling back' the general update step - just as we did in the first Subsection above for the exponential average example. If we do so one time - plugging the formula $h_{t-1} = f\left(h_{t-2},x_{t-1}\right)$ into the formula for $h_t$ - we can see dependence on both $x_t$ and $x_{t-1}$

\begin{equation} h_{t} = f\left(f\left(h_{t-2},x_{t-1}\right),x_t\right) \end{equation}

Continuing in this fashion we can 'roll back' all the way to $h_1$

\begin{equation} h_{t} = f\left(f\left(f\left(\cdots f\left(h_{1},x_{2}\right),x_3\right)\cdots,x_{t-1}\right),x_t\right) \end{equation}

which exposes the fact that $h_t$ depends on all prior values $x_2$ through $x_t$ - and on $x_1$ as well if we simply set the initial condition $h_1 = x_1$ (as we did with the exponential average above). In general then, when we say that '$h_t$ provides a summary of all preceding input $x_1$ through $x_t$' we mean exactly the statement above.

Figure 1: Graphical model representations of both fixed order (top panel) and variable order (bottom panel) dynamic systems. In the former case when we 'roll back' the recursion we see - in the end - that every point in the system depends entirely on the system's initial condition(s). Conversely with variable order systems *every preceding input* plays a role in the value of its next step, and is embedded in the system.

Note how much this characteristic differs from that of the fixed order dynamic systems described in the previous Section, where every value depended only on the initial conditions of the system. That complete dependency made fixed order dynamic systems extremely sensitive to their initial conditions - as we saw in several examples in the prior Section. Being dependent on all input data at each step, variable order systems do not suffer this same fragility.