Regularity
Gerry Wolff
gerry at sees.bangor.ac.uk
Wed Jan 27 01:52:26 PST 1999
Detlef Morgenstern wrote:
>
> Dear Gerry,
>
> I see no real disagreement in our discussion. The point is, I feel unneeded
> limitations which result from your reducing regularity to repetitive
> occurrence of entities in space/time. My dilemma: For the time being, I
> cannot provide alternatives (on the algorithmic level) for including
> regularities which are far from being repetitive. Thus, my contributions
> tend to be missionary appeals:
> Widen the scope of regularity (functionality)!
>
> I completely agree with your robot example in
> http://www.wco.com/~sanna/casc/archive/msg00085.html
> and the compressed form ("function") you find for a set of actions ("data").
> But again, the example is biased in itself. It is taken from a repetitive
> world. The given sequence of actions of the robot is repetitive, and
> naturally, will the SP machinery - specialised in repetitive cases - find a
> compressed form which you have a legitimate claim to call "function".
>
> I beg your pardon for classifying this as a "special case" of regularity
> (function). I stick to my opinion that a lot of regularity can be found in
> worlds without obviously repeated occurrence of similar entities. I would
> even say that there is, statistically, much more regularity which is
> non-repetitive. Once I take this view, (repetitive) SP regularity will model
> only a tiny slice of the real world for me. Examples? Pick out an arbitrary
> numerical or logical function z=f(x,y). Generate an "observation" table (3
> columns x,y,z). There is little chance you will succeed reconstructing "the
> rule" f() from these "observations" if you will exclusively rely on the
> preferred by SP repetitive regularities.
Let us take a *very* simple function: x = y. A table of points for this
function would look something like this:
x y
1 1
2 2
3 3
4 4
5 5
etc
Where is the repetition in this? An obvious place that it repeats is
that each number in the x column is repeated in the same row in the y
column. Generalising this so that it applies to any two values means
arranging things so that the first value can be matched and unified with
the second value.
Well, this is just one simple example. What about z = x + y (closer to
your example)? The table would look something like this:
x y z
1 1 2
1 2 3
2 1 3
1 3 4
3 1 4
etc (very tedious!)
As you say, it is not obvious where the repetition is here. Given just
the table, how could the function z = x + y be inferred? Let's imagine
that, like a mathematician, the inference/learning system already knows
the number system. And let's imagine, just to make things simple for the
sake of discussion, that the system knows that we are looking for a
function z = f(x, y) and that f() is one of add, subtract, multiply or
divide.
Given this amount of help, it is not difficult to see how the function
could be inferred: try +, -, x and / for the first row and make a note
of which one fits (only + works here). Then do the same for the second
row. Ditto. Likewise, for the third and subsequent rows.
After we have done this for the finite number of rows that we have been
shown, we see that the only function that works for every row is +. In
short, + ***repeats*** on every row! Here is the repetition that we are
discussing.
What about the inference that this should be true for all other rows? At
this point, mathematicians use an inductive argument based on the
structure of the number system. The essence of the number system is that
it is recursive - which is another way of saying that it expresses the
***repetetive*** fact that, given any number, we can obtain the next
number by adding 1. So, again, to make the inductive leap, we use the
fact of repetition which is already built into the number system.
These are only examples but they seem to me to lend support to the idea
that all kinds of regularity do, ultimately, boil down to repetition.
Of course the examples do not prove the point. It is after all only a
***working hypothesis***. Basically, I am trying to see how far this
working hypothesis may take us. So far, it seems to take us quite a long
way!
>
> This picture gets the worse, the more nested levels of functionality
> f(g(...),h(...)) we encounter.
As I have said somewhere else, it seems to me that the main reason that
mathematicians and programmers create named functions (like g(...) and
h(...)) is so that they can be referenced from two or more contexts
without having to write them out in full each time. This is compression
based on repetition! 'Nested levels' in functions is sign that
compression techniques have been applied.
>
> > The general point is that 'data' and 'function' can both be
> > expressed as a stream of binary digits.
>
> This is the "general SP point". The moment one treats a function as "data"
> he does something similar to the following:
> - take a photo of something
> - scan/digitise the photo
> - compress the image file
> - expect the compressed image file helping you better understand the
> functionality of this "something".
>
> This will work only in few selected cases (with "functionality sensitive"
> types of compression). The crux is that otherwise in the course of the
> transformation performed, functionality (rule) inherent to the "something"
> gets lost. You already won't find it in the photo (unless you have a
> super-intelligent photo interpreter, who knows all about all).
>
> A rule finder tool will be "universally" applicable only if we succeed
> reflecting the functionality (cause to effect/input to output causality) of
> an entity in the "image" of the modelled entity. We need a "living model" of
> this entity on a level of a "working engine". And only compressing this
> working engine (instead of a photo of it) will guarantee we will make it
> simpler not losing its functionality (power). Of course, the functionality
> of the engine may be described by some data ("program"). As any data, this
> program can be reduced to streams of bits. BUT - if we want to compress
> these streams, we must consider special rules. These rules must make sure we
> do not hurt functionality of the modelled engine. Blind ZIPping of bit
> streams won't do. (Same counts for symbol aggregation levels higher than
> "bit".)
Compressing a photograph using standard 'quick and dirty' compression
algorithms will not tell you much about the photo. But compressing it
with something more sophisticated that picks out 'natural' structures,
may well tell us something useful about the things in the photo.
Of course you are right, that you can learn a lot more from having
additional information available such as a sequence of frames in a
cinematographic sequence and having information about sounds and other
non-visual information. In principle, this 'extensive' information can
be treated like the photo and compressed (using 'sophisticated'
compression techniques that can pick out 'natural' structures). Since
more information is available in the 'extensive' view, you are likely to
learn more about meaningful aspects of the data.
Biochemists use some of these kinds of ideas in trying to understand the
structure and function (and evolution) of DNA sequences or sequences of
amino acid residues (in proteins). Looking for repeating 'motifs' is a
standard method of analysising what is going on in these large
molecules.
Having said all this, I do have an open mind! It could well be that
there are kinds of regularity that don't fit into the framework I am
trying to construct. As I said above, the repetition idea is just a
working hypothesis. I will keep pushing it until it 'breaks'.
Bye for now.
Gerry
More information about the Casc
mailing list