Thursday, April 23, 2015

Convolution with Deltas - A Word on Notation

Mathematical notation is important. This is even more true when we consider the convolution of two functions or of two sequences. In particular, if $f$ and $g$ are two functions, we write the convolution as

$$ h(x) = (f * g)(x) $$

whereas if $\{a_n\}$ and $\{b_n\}$ are two sequences, the convolution can be written as

$$ \{c_n\} = \{a_n\} * \{b_n\}. $$

What should not be done under any circumstances is to write the convolution of two functions as

$$ h(t) = f(t) * g(t). $$

Why not? Well, because the convolution operator $*$ does not operate on values but on functions; $f(t)$ and $g(t)$, however, are values. This can easily be seen by inserting $t=3$ into the above equation. We get

$$ h(3) = f(3) * g(3) $$

which obviously does not make any sense. The equation

$$ h(3) = (f*g)(3) $$

is the only appropriate way of writing this. In stark contrast, take a different binary operator, like addition. For addition one can easily argue that the value of the sum of two functions at a given point equals the sum of the values of the individual functions: addition works on values, i.e.,

$$ h(t) = (f+g)(t) = f(t)+g(t). $$

Still, many excellent textbooks on signal processing (even the ones by Oppenheim and Schafer and by Vetterli) make use of this "abuse of notation" for convolution. The reason is that as soon as a Dirac (or Kronecker) delta enters the game, the famous convolution property pops up somewhere. In incorrect notation, this property can be stated as follows:

$$ h(t)=f(t) * \delta(t-T) = f(t-T) $$
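In the discrete-time case, this shift property is easy to check numerically. Here is a minimal NumPy sketch (the test signal and the shift $T$ are arbitrary choices):

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])   # a short test signal f[n]

T = 2                                 # shift by T samples
delta = np.zeros(5)
delta[T] = 1.0                        # Kronecker delta, shifted by T

# Convolving with the shifted delta simply shifts f by T samples:
print(np.convolve(f, delta))          # [0. 0. 1. 2. 3. 4. 0. 0.]
```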

Finding a correct notation is not easy. I found a few helpful comments on stackexchange. I will add my own idea at the bottom; please judge for yourself which one you like most. I have to admit that none of these is really satisfactory...

  1. One commenter suggested introducing the shift operator $T_T$, i.e., $T_T$ is the function mapping $t$ to $t-T$. Then, $\delta(t-T)=\delta(T_T(t))=(\delta\circ T_T)(t)$ and we get $h(t)=(f*(\delta\circ T_T))(t)$.
  2. Another commenter proposed to write $h(t) = (f*\delta(\cdot-T))(t)$.
  3. My idea is to define a family of functions $\delta_T(t):=\delta(t-T)$ and then write the convolution as $h(t)=(f*\delta_T)(t)$.
While my suggestion might give the shortest notation, it is not as general as the one with the shift operator. I would have trouble, e.g., writing the convolution of a function with its time reversal, i.e., (in bad notation) $h(t)=f(t)*f(-t)$. If we introduce a time-reversal operator $R$ for which $R(t)=-t$, we can write this convolution easily as $h(t)=(f*(f\circ R))(t)$.
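Incidentally, the "operators act on functions, not on values" view is exactly how one would code this in a language with first-class functions. A small numerical sketch (the truncation window and the Gaussian test function are arbitrary choices):

```python
import math
from scipy.integrate import quad

def conv(f, g, lo=-10.0, hi=10.0):
    """Return the function f * g, with the integral truncated to [lo, hi]."""
    def h(t):
        val, _ = quad(lambda tau: f(tau) * g(t - tau), lo, hi)
        return val
    return h

def shift(f, T):
    """The function f(. - T), i.e., f composed with the map t -> t - T."""
    return lambda t: f(t - T)

def reverse(f):
    """The time reversal of f, i.e., f composed with R: t -> -t."""
    return lambda t: f(-t)

f = lambda t: math.exp(-t**2)        # a Gaussian test function
print(conv(f, reverse(f))(0.0))      # f is even, so this is (f*f)(0) ~ 1.2533
print(conv(f, shift(f, 3.0))(3.0))   # the shift property: also (f*f)(0)
```

Note that conv, shift, and reverse all take functions and return functions; values appear only in the final evaluation step, which is precisely the point of the discussion above.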

Wednesday, April 1, 2015

Properties of a Special Iterated Function System (open)

This post is loosely coupled with my recent post on the fractality of polar codes; although the problem posed there is already solved (I might write about that later), there are still a few open questions which might be of independent interest.

Assume you are given two functions $g_0\colon [0,1]\to[0,1]$ and $g_1\colon [0,1]\to[0,1]$, which are defined as
$$ g_0(x)=2x-x^2 $$
and
$$ g_1(x)=x^2. $$

Now assume that we have a sequence $a^k:=(a_1,a_2,\dots,a_k)\in\{0,1\}^k$ which determines the polynomial
$$ p_{a^k}(x) := g_{a_k}(g_{a_{k-1}}(\dots g_{a_1}(x)\dots )). $$

We are now interested in the properties of this polynomial. Since the case where all $a_i$ are equal (i.e., all zeros or all ones) is trivial to analyze, we will assume that $a^k$ contains both zeros and ones. W.l.o.g. (see below) we may assume that $a_1=1$ and $a_2=0$, hence $p_{a^2}(x)=2x^2-x^4$. We are looking for the number of fixed points of $p_{a^k}$ for fixed $k$, i.e., points $x^*\in[0,1]$ which satisfy $x^*=p_{a^k}(x^*)$. Can't be that complicated, can it?
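Before going further, here is a small numerical sketch (Python with NumPy) that evaluates $p_{a^k}$ and locates its fixed points in $(0,1)$ via sign changes of $p_{a^k}(x)-x$:

```python
import numpy as np

g = {0: lambda x: 2*x - x**2, 1: lambda x: x**2}

def p(a, x):
    """Evaluate p_{a^k}(x) = g_{a_k}(...g_{a_1}(x)...); a_1 is applied first."""
    for bit in a:
        x = g[bit](x)
    return x

def fixed_points_in_01(a, n=10**6):
    """Approximate the fixed points of p_{a^k} in (0,1) by sign changes of p(x)-x."""
    x = np.linspace(0.0, 1.0, n)[1:-1]     # exclude the trivial fixed points 0 and 1
    d = p(a, x) - x
    return x[np.where(np.diff(np.sign(d)) != 0)[0]]

# Example: a^2 = (1,0), i.e., p(x) = g_0(g_1(x)) = 2x^2 - x^4.
print(fixed_points_in_01((1, 0)))   # one value, near (sqrt(5)-1)/2 ~ 0.6180
```

For $a^2=(1,0)$ this finds a single fixed point near $(\sqrt{5}-1)/2\approx 0.618$, a number which will reappear in the polar-coding example below.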

What do we know about $p_{a^k}$?

Only a few things, unfortunately:
  • Since $g_0$ and $g_1$ map $[0,1]$ to $[0,1]$, so does $p_{a^k}$.
  • Since $g_0$ and $g_1$ are strictly monotonically increasing, so is $p_{a^k}$.
  • Since $g_0$ and $g_1$ have fixed points at 0 and 1, so does $p_{a^k}$.
  • For $a_1=1$ and $a_2=0$, the derivative of $p_{a^k}$ vanishes at 0 and 1; hence, these two fixed points are attractive.
  • $p_{a^k}$ is a polynomial in $x$ of degree $2^k$; hence, by the fundamental theorem of algebra, it has exactly $2^k$ roots (counted with multiplicity, over the complex numbers); hence, so does $p_{a^k}(x)-x$, which means that there are exactly $2^k$ fixed points.


What do we expect?

Well, this is easy: I think (and extensive numerical analyses confirm this) that $p_{a^k}$ has exactly one (repelling) fixed point on $(0,1)$. This can be understood intuitively as follows: A fixed point on $(0,1)$ is an intersection of $p_{a^k}$ with the identity function ($f(x)=x$), both of which pass through the points $(0,0)$ and $(1,1)$. Since the slope of $p_{a^k}$ is zero at 0 and 1, it must lie below the identity function for values close to 0 and above it for values close to 1. Hence, there must be at least one point in between where $p_{a^k}$ crosses the identity function; moreover, since $p_{a^k}(x)-x$ changes sign between the endpoints, its transversal crossings on $(0,1)$ can only come in odd numbers.

What about the rest of the $2^k$ fixed points? They either have to lie outside $[0,1]$ or have to be complex (in conjugate pairs).
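For moderate $k$ this accounting can be checked directly: build $p_{a^k}$ by exact polynomial composition and inspect all $2^k$ roots of $p_{a^k}(x)-x$. A sketch with NumPy's polynomial module (the example sequence is arbitrary, and root finding at degree $2^k$ is numerically reliable only for small $k$):

```python
import numpy as np
from numpy.polynomial import Polynomial as P

g0 = P([0.0, 2.0, -1.0])   # g_0(x) = 2x - x^2
g1 = P([0.0, 0.0, 1.0])    # g_1(x) = x^2

def p_poly(a):
    """Build p_{a^k} as an exact polynomial by composition, g_{a_1} first."""
    q = P([0.0, 1.0])                # start from the identity x
    for bit in a:
        q = (g1 if bit else g0)(q)   # polynomial composition
    return q

a = (1, 0, 0, 1, 1, 0)                       # an example with k = 6
r = (p_poly(a) - P([0.0, 1.0])).roots()      # fixed points = roots of p(x) - x
print(len(r))                                # 2^6 = 64 roots in total
real = r[np.abs(r.imag) < 1e-8].real
print(real[(real > 1e-8) & (real < 1 - 1e-8)])  # expect exactly one in (0,1)
```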

What have I already tried?

Not much, I'm afraid, although I spent a few days on this problem. I tried essentially two different approaches, but neither led (quickly) to the desired result:

The first approach relied upon mathematical induction: For $p_{a^2}$ it is easy to show that the derivative is a quasi-concave function, which admits the application of certain inequalities. If one could then show that $g_0(p_{a^k})$ and $g_1(p_{a^k})$ also have quasi-concave derivatives, the problem would be solved. Unfortunately, this cannot be shown (there are counter-examples). The approach could be refined by using the fact that the derivative of $p_{a^2}$ is convex on the intervals $[0,a]$ and $[b,1]$, but concave on the interval $[a,b]$ (for $a$ and $b$ chosen appropriately). That this property is preserved under applying $g_0$ or $g_1$ is not as easy to show. And even if we succeeded in that, it is not clear whether this would deliver the desired result.

This brings us to the second, more direct approach: It relies on counting the number of real roots of $p_{a^k}(x)-x$ in the interval $(0,1)$. While Descartes' rule of signs is not of much use here, and while Sturm's method is too difficult to apply in the general case, Fourier's theorem looks very promising, given that the derivatives need to be evaluated only at the fixed points 0 and 1. Since Fourier's theorem involves higher-order derivatives of a composition of functions, the most straightforward way would be to apply Faà di Bruno's formula to either $g_i(p_{a^k}(x))$ or to $p_{a^k}(g_i(x))$. "Straightforward" in this case does not mean "easy", and I had to quit my efforts soon (mental note: I should offer a student project...).
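What is hard by hand is at least cheap by computer algebra: for any concrete $a^k$, SymPy's exact real-root isolation counts the fixed points in $(0,1)$ rigorously. A sketch, with an arbitrarily chosen example sequence:

```python
import sympy as sp

x = sp.symbols('x')
g = {0: 2*x - x**2, 1: x**2}

def p_expr(a):
    """Exact symbolic p_{a^k}(x), composing g_{a_1} first."""
    e = x
    for bit in a:
        e = g[bit].subs(x, e)
    return sp.expand(e)

# Exact real-root isolation of p(x) - x, restricted to the open interval (0,1):
a = (1, 0, 1, 1, 0)
roots = sp.Poly(p_expr(a) - x, x).real_roots()
print(len([r for r in roots if 0 < r.evalf() < 1]))   # the conjecture: always 1
```

Of course, this settles the question only for individual sequences, not for general $k$.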

A third, related approach is to show that the second derivative of $p_{a^k}$ has exactly one root in $(0,1)$, showing that $p_{a^k}$ has only one inflection point in this interval. This would do the job, but again requires tedious calculations with Faà di Bruno's formula.
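The same computer-algebra route can at least test this inflection-point claim for concrete sequences (a sketch; the sequence is again arbitrary):

```python
import sympy as sp

x = sp.symbols('x')
g = {0: 2*x - x**2, 1: x**2}

e = x
for bit in (1, 0, 1, 1, 0):          # a concrete a^k, g_{a_1} applied first
    e = g[bit].subs(x, e)

# Count the real roots of p'' in (0,1); if the approach is to work,
# this should print 1 (a single inflection point).
roots = sp.Poly(sp.diff(e, x, 2), x).real_roots()
print(len([r for r in roots if 0 < r.evalf() < 1]))
```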

What does this have to do with Polar Codes?

I'm glad that you ask. You might remember the example from my previous post:

Example: Consider the binary erasure channel and let $b=2/3$. Then, the binary expansion of $b$ is $0.101010\dots$. A binary erasure channel with capacity $C>(\sqrt{5}-1)/2$ will converge to a perfect channel, one with a capacity strictly lower than this number will converge to a useless channel, and one with a capacity exactly equal to this number will stay at this number.
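This example is easy to reproduce by iterating $g_0$ and $g_1$ along the periodic pattern $(1,0)$, reading bit $a_i=1$ as applying $g_1$ and bit $a_i=0$ as applying $g_0$ to the channel capacity (a sketch; the step count is an arbitrary choice):

```python
from itertools import cycle, islice

g = {0: lambda x: 2*x - x**2, 1: lambda x: x**2}

def polarize(c, pattern, steps=50):
    """Iterate a BEC capacity c along a cyclically repeated bit pattern."""
    for bit in islice(cycle(pattern), steps):
        c = g[bit](c)
    return c

phi = (5**0.5 - 1) / 2     # the interior fixed point of 2x^2 - x^4
for c0 in (phi - 0.01, phi, phi + 0.01):
    print(c0, '->', polarize(c0, (1, 0)))
# below phi -> ~0 (useless), at phi -> stays put (up to rounding), above -> ~1 (perfect)
```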

All rational numbers have binary expansions that are eventually recurring, i.e., that repeat after some period. We do not consider dyadic rationals here, although they can be called recurring, too. In general, i.e., if the rational is not dyadic, the period will contain both zeros and ones. And since we can analyze any period we want, we can choose the period such that $a_1=1$ and $a_2=0$.

Polar coding also suggests that what we expect is in fact true: Lemma 11 in this paper states that for almost every binary expansion $a^\infty$ there is exactly one fixed point of $p_{a^\infty}$ in $(0,1)$. This "almost every" unfortunately tells us nothing about the binary expansions of rational numbers, since in this context the rational numbers form a set of measure zero. Hence, if we were able to show the above property for $p_{a^k}$, we could improve our knowledge about the behavior of polar codes, or of the polarization effect in general.

Let me know if you want to work on this -- it is a nice problem, and I would love to see a definitive answer to it.