reduntex.txt in /html/abs Some notes concerning redundant (hence non-orthogonal) expansions Hans G. Feichtinger started Dec. 16th, 1994 Physics has obtained a lot of insight into the behavior of matter by the observation that macroscopic (i.e. large) pieces of matter can be understood as a collection of smaller objects (such as the physical atoms), which have quite a lot of similarity among each other (but there are also many different ones). For example, the reaction of a body under a field of forces can be understood quite well as the superposition of the action of this field on the individual atoms. [ I do not elaborate on the scientific and philosophical relevance of the - should I say - invention: of the idea of a physical force, because nobody has ever seen a force by itself, but always only concluded indirectly that such a "thing" must exist, because why should a piece of matter otherwise speed up, or change direction, or just "fall" towards the center of the earth]. There is also the observation that in order to identify what is usually called the "center of mass" within a physical body (let us think of some ball, or cone, or cube) one can apply integral calculus, which is more elegant than summations over an extremely large sum of forces acting upon single atoms, whose position one does not know anyway. However, from the mathematical point of view (since Riemannian sums are actually convergent towards the Riemannian integral, for example) it what not be contradictive to think that the matter has some kind of continuity, and to idealize this piece of matter as a "continuous integral" over an infinite variety of infinitely small "generalized atoms" which neither have mass nor extension (but as a whole make up this piece of matter). When it comes to functions and operators (or linear functionals) similar ideas can be used: We can see a (for simplicity) continuous function f as the "list of its values" {f(x),x \in \R}, where for each point on the real line x we have some function value f(x). Although a function which takes only non-zero values over finitely many points (or even just a single one) does have integral zero, this (uncountable) collection of values HAS a non-zero integral (at least if f is positive and non-zero). However, the integral (or any other linear functional on a space of functions, or even a linear operator) might be understood in the same way as the sum of actions on some "larger" , "elementary building-blocks" which make up the function f altogether. For example, if f happens to be a polynomial function we have the "natural" expression for f(x) as a sum over the monomial functions x > x^p, with p = 0,1,2... k (the order of the polynomial f). In other cases trigonometric functions or other systems of classical functions might be more natural (for interpretation). However, one must say that in principle (from a logical point of view) there is no real argument for one or the other "representation" or "expansion" of a given function (signal in the engineering termology), it is rather a question of convenience (in practical terms), or of the underlying modelling, which makes one description more likely than another one. The questions, what the "true nature" of a given function is (in other word, whether it is "truly" a sum over this or another collection of building blocks) belongs rather to the field of philosophy and will not be discussed here. Let us come back to the idea of matter being a superposition of, now what should we say? True atoms (with "nothing in between"), or a space filling collection of atoms (like in a crystallographic packing), or a continuity of "phantomic" atoms? Each of them has its virtue, but in some cases it is even very reasonable to think of atoms not something FIXED in space, but moving around, with some probability to be at a certain position more or less frequently, with maybe a center for the locations of high probability. In that case there is also NO reason, that there are places in space (actually all of 3D space will have this property for a natural modelling) where SEVERAL atoms may be found, with some probabilty (the total probability over those atoms being of course not larger than one). What does this "physical" model tell us about the decomposition of functions. I would like to argue in the following way: Whereas orthogonal systems (once fixed) have all the beauty of some crystallographic lattice, it might be the case, that in order to expand a given signal a certain choice of a (complete) orthonormal system might be relevant, whereas in the case of the atomic decomposition (as we probably think of our physical world) one has less strict order, but still a unique description (we speak about the atoms which make up some piece of matter). In mathematical terms one might compare this with a so-called Riesz basis, which is a collection of linear independent atoms in a Banach space (complete normed space), which allows to represent any element in that linear space as a series ("infinite sum", i.e. the limit of finite partial sums) in a uniquely defined way. Observe, that such a series expansion is a much stronger requirement than just the approximation of a function f by finite linear combination of "atoms" (or building blocks). In the first case (series expansion) one is only allow to add more and more terms, and essentially (in the long run) has to take precautions that by just adding more and more terms one comes closer and closer to the function f (to be expanded), WITHOUT ever getting far away from it late in the process of summation. In the second case it is enough that for a pre-given degree of approximation \eps one can find some finite linear combination of atoms such that one can achieve \eps-approximation. However, it is allowed to take a completely NEW set of building blocks whenever one has to get to better approximation, and does NOT have to use the previously used atoms. In real life this difference can be explained by the difference in planning for an large building. Whereas in the second situation it means that in order to "enlarge" some big house one is allowed to tier it down and destruct it and just build a new one of appropriate size, the first situation corresponds more to the situation, maybe given to an architect, that (at least for some while) he should take precautions that the given house can be expanded more and more (WITHOUT destruction going on) to fulfill better and better the needs for more and more space (in the sense of office space, if this house might be a large office building). Picking up our main line of ideas we compare now the previously explained idea of "particles having some probability of being located here and there" with a somewhat "overlapping", or in mathematical terms, linear DEPENDENT family of atoms. In that case a given function may have MANY different representations (as a series), but one cannot say that any of those expansions is the "true" or the "real" one (if such an argument does not come from some modelling process). A point in case would be the description of a little model house built by a kid from LEGO "bricks", but everybody knows that there are MANY different ways of building the "same" house. We may take this analogy even further and argue in the following way: Thinking of LEGO bricks it is obvious that the way how a given "block-house" can be built, and how many "stones" we need will depend very much on the collection of available atoms. If the collection is large the advantage will be that for certain situations one needs only VERY FEW terms. If the house has the same window-front on all four sides one might just use four copies of the same side-wall plus a roof in order to complete the house. From the point of view of variety one then has to make a choice. Either one restricts the possibilities (everybody has experienced as a child that certain things cannot be built, using only a limited collection of LEGO bricks, one woule like to have more and different ones), or one enlarges the number and variety of possible bricks. However there the danger is immanent that the number of different types of blocks will increase quickly beyond any reasonable size, it is hard to keep track of them, and in a given situation it might be actually hard to find a good collection of bricks fitting "best" to a given plan, because with the number of building blocks the number of possible variations (in building the house) might increase and finding even only a "near optimal" choice among those many possibilities might become an intractible combinatorial optimization problem. So at the end (taking the average over a large number of completely different houses to be built) it might be really easier to just go back to the old system of using the small standard bricks, which have all more or less equal size, and using many of them, but in a straigforward way in order to raise the house according to the plans of our architect. Going back to our mathematical problem, we would like to match the given function (our plan) using a given set of "atoms" (building blocks). Then, in order to do this efficiently, one has to say what that means: Do we want to have an efficient way of choosing the atoms AND then the appropriate coefficents, are the atoms given and do we want to minimize the total (square) sum of the coefficients needed, do we hope that as many functions (e.g. those in a large subspace of our function/signal space) can be obtained with a rather samll number of non-zero terms? Is speed of the calculation the optimization criterion, will those coefficients and maybe the parameters of the corresponding atoms be stored somewhere, or transmitted. In that case the speed of synthesis (the availability of a fast synthesis algorithm) might be a matter of concern. There are many real world situations where really such differences occur. In one case one tries to transmit some image from a satellite down to earth. The computer on the satellite has limited capacity, but the information channel (connecting it with the station on the earth) might be even more limited. On the other hand the computers down on the earth are much more powerfull, so synthesis might be much less of a problem in this case. Conversely, in the case of digital broad-casting the task is also to bring high-level information (such as CD-quality sound) to the end-user. In that case one certainly will have to design the overall system in such a way that allows for cheap receivers (you may think of the hope for allowing a single signal processor card do the synthesis work, even for stereo signals received), whereas in this case the sender's side can be much more expensive. Indeed, there are only comparatively few (and well financed) radio stations, but millions of customers which might not like to pay a lot of money for very expensive receivers. Finally, in some other systems only the amount of information stored is really relevant (e.g. for image data base systems). There is a further aspect, that might go along with the choice of a concrete system of building blocks: interpretation of the given representation or the coefficients used. Here, for the case of image analysis or digital signal processing the terms "multiresolution" and time-frequency localization are used in many differnent ways. One speeks about coarse and fine details of some image, or low and high frequencies in a given sound signal (in some melody). Certainly, schemes of coefficients which have such interpretations (such as wavelet or Gabor- expansions) have advantages for the practical use over others, because in that case coefficients are "not just numbers" but carry extra information which may be very useful for the "user". ..redundancy... relevance for pattern recognition tasks..