Regularity

Gerry Wolff gerry at sees.bangor.ac.uk
Fri Jan 15 02:09:08 PST 1999


Dear Detlef,

I have put some more thoughts below the following quote from you (and
me). 

Regards,

Gerry

Detlef Morgenstern wrote:
> 
> Dear Gerry,
> 
> Thanks for your systematising work! After such convincing arguments it may
> seem stubborn if I will still insist on what I called "functional
> regularities". Let me try to highlight those turning points at which I tend
> to not follow your arguments.
> 
> You wrote:
> 
> > In these terms, a program (or 'function') which compresses some
> > data may be regarded as an expression of 'regularities' in the data.
> > This seems to correspond with the kind of 'functional' regularity
> > you are discussing.
> 
> No. The term "functional" does not aim at the "function" which compresses
> the data but at the fact, that we must compress entities in a way that their
> "function"ality (their functioning) is not affected: After successful
> (inductive) compression they function as before, they perform the same
> function (which expresses the regularity in them), but with less redundancy.
> They are better structured now.
> 
> A logical consequence of this view is that I will see our task not only in
> compressing a (passive) "body of data" but some active entities instead. It
> resembles compressing a machine while it is working. Compress it in such a
> manner that a user of this machine will see it is less iron now but does the
> same thing as before.
> 
> I think we will dramatically limit ourselves if we reduce the task to
> throwing out all the gearwheels of identical type leaving a tag saying
> "Insert gearwheel type X here".
> 
> > 1 If you look at computer programs (in almost any high level
> > language) with the right 'spectacles' it is possible to recognise
> > programming devices which are essentially the same as the
> > devices used in 'standard' techniques for data compression.
> > For example:
> 
> [examples for repetitive regularities follow]
> 
> It is absolutely correct, that today's programs are marvellous examples for
> repetitive regularities. But why? I think because we are biologically biased
> prioritising repetitive regularities in our understanding of the world, even
> more in our creations which are meant to model this world (see "Regularity
> (1)"). This status quo does not mean the world is like our programs, but our
> programs are like evolution adjusted our minds to everyday needs.
> 
> > * Michael Jackson said that a 'well-structured' program
> > is one that 'models' the structure of its input data and/or
> > its output. For the case we are considering (representing
> > a body of data in compressed form), there would be no input
> > and the output should be the given body of data in uncompressed
> > form.
> 
> This is the actual parting of the ways. You are considering "compressing a
> body of data", I want to compress "living entities". In terms of input and
> output: The input to an SP "compressor" is a raw body of data, it's output
> is the compressed data. In my view, inductive compression gets an input
> which is some redundant regularity specification (focused on the
> functionality of this regularity), and the output is an instance with the
> same functionality but built of less parts (less redundant).
> 
> "Repetitive" compression processes a body of data, whereas inductive
> compression processes a (function) specification.
> 
> There is some irony in this recursion: How would you best describe the
> functionality of something? Describe which output results from which input!
> I am working on a procedure which allows "inductively" compressing this
> specialised kind of data. It will take some more time, but I hope one fine
> day I can present some prototype. This causal activity of entities is one
> more "dimension" which must be tamed. But if you simply set it to zero, the
> whole active world collapses to dead substance.
> 
> > * It is not too hard to see that compression is greatest if the
> > 'chunks' which are identified in functions, loops or classes
> > are relatively frequent. For a given size, a named function
> > which is used only 2 or 3 times will yield less compression
> > than one which is used hundreds of times.
> 
> This is the "repetitive" view. I can imagine cases that there are few
> repetitions at all but inductive compression does a great job. I did not
> invest time to construct such a (counter-)example, but if you think this
> could illustrate my arguments, I might construct an example of
> "before-inductive-compression" compared to "after-inductive-compression".
> This can be verified even we if still do not have such an inductive
> compressor: We simply must prove that functionality of such an entity
> remained the same, while redundancy could be reduced.
> 
> > 4.1 That being the case, principles which apply to the
> > 'induction' or 'discovery' of grammars are likely to carry
> > over to the process of forming well-structured programs.
> 
> OK, but once we recognised that this understanding of well-structured
> programs is biased to repetitive regularities, a discovery of grammars
> following this scheme will inherit this property. This does not prove, that
> this will be the ultimate insight.
> 
> I could repeat ideas similar to things said above in reply to what you
> summarise then. I omit that.
> 
> The second section of your posting ("Discovering Addition") needs some
> deeper exploration. No comments today.
> 
> I the beginning you say something nice about
> > ... trying "walk before I try to run"...
> 
> If we want to fly - running faster and faster alone will not do...
> In my stubborn criticism something seems to have got lost: Thanks for the
> intense work you invest in finding the "rule of the rules"!
> 
> Best wishes,
> 
> Detlef

I suspect part of the apparent disagreement is to do with the meanings
which we are giving to words like 'data' and 'function'. I suspect that
we are not really disagreeing at all.

Let me use one or two concrete examples to test whether I have
understood your points correctly.

It is possible to devise a 'function' whose main purpose ('function' if
you like) is to compress some 'data'. Good examples, of course, are
programs like Zip or 'compress' (in Unix) or programs used to compress
video signals before they are transmitted. No doubt, the designers of
programs like these have ensured that the programs themselves do not
contain unnecessary redundancy.

It is also possible to devise a 'function' whose main purpose is perform
some 'function' in the real world. An example, perhaps, would be a
program inside a simple factory robot whose 'function' is to transport
widgets from location A to location B in the factory.

If I understand what you have said, you are suggesting that my approach
is neglecting this kind of real-world 'function'. Please correct me if I
am wrong.

Assuming that I have understood you correctly, I don't see why
'computing as compression' should not apply to the robot's program just
as it may do to compression algorithms. In simple terms, the program
inside the robot could be written something like this:

go to A; pick up widget; go to B; deposit widget; go to A; pick up
widget; go to B; deposit widget; go to A; pick up widget; go to B;
deposit widget; go to A; pick up widget; go to B; deposit widget; go to
A; pick up widget; go to B; deposit widget; go to A; pick up widget; go
to B; deposit widget;

and so on.

Of course, no one would ever write a program like this. The program
would be written in compressed form like this:

WHILE (there are still some widgets to be transported)
BEGIN
     go to A; pick up widget; go to B; deposit widget;
END

The general point is that 'data' and 'function' can both be expressed as
a stream of binary digits. So, in one sense of the word 'data', they are
both 'data'. This means that exactly the same principles may apply to
both. The program itself can be compressed and, if my working hypotheses
are correct, the process by which programs are 'executed' may be
understood as a process of compression.

Have I understood you? Or is this all completely off target?

Gerry



More information about the Casc mailing list