DPCM, ADPCM, u-Law, LPC, compression and redundancy

Detlef Morgenstern detlef_morgenstern at yahoo.de
Fri May 18 02:11:36 PDT 2001


Hello Max,

--- Max <max at little.tele2.co.uk> wrote:
For example, given the string:
> 
> 000011110000111100001111
> this can obviously be compressed as
> ...

I cannot say it 'more compressed' as in
http://sun.sanna.com/casc/archive/frm00171.html:
"-- On Essence --
Something has been brought to its essence, when it cannot be
compressed any further - still doing the same.
Essence is about acting, not about being.
It is about producing patterns, not being patterns.

-- On Inductive Compression -- 
If we want to 'mechanically' distill the essence of something, we
must compress it. But there is a strict constraint for this
compression: After completing compression, the compressed instance of
our something 'must work as before'. Which means: When it is
confronted with a given set of patterns, it produces the same
patterns as it would have done before compression."

We are talking about computing here. Let us assume therefore that
your
000011110000111100001111
is not simply 'data' but a definition of activity ('program').

Some questions arise.

(1) What does this string 'do'?
(2) Can we find another - and shorter! - string (='program') which
does the same? There seems to be no use compressing the string as
such as long it is not known who will interpret this string how?
Which 'UCM microprogram' processes the string?
(3) Of all the possible UCM microprograms - can we find a
self-explaining UCM microprogram which is as Simple&Powerful as
possible?

Concerning (3), I suggest this one:
Each bit in 
000011110000111100001111
represents the state of a line at a given moment of time. These lines
might be connected to sensors (they help 'seeing' something) or
actors (they help 'doing' something). Understand it as some snapshot,
observation evidence, sample.

All what we know from
000011110000111100001111
is that "there was a case that these bits were associated:
000011110000111100001111"

Is this a rule? A program which can be compressed? I think, no. We
run the risk of over-generalizing if we deduce like that:

"Once we have got some comfort from 
000011110000111100001111
and now could observe that
00001111000011....001111
we should act like
..............1100......"

It becomes interesting only after subsequent observation:
000011110000111100001111
010010100101010100101010
111101001101010110000010
101010101010001001010110
...

Is it 'white noise'? Do all possible associations occur?
000000000000000000000000
000000000000000000000001
000000000000000000000010
...
000011110000111100001111
...
111111111111111111111110
111111111111111111111111

Or can we observe some repetition of patterns, and do some never
occur?

Having collected a large number of samples (induction!), we can dare
compressing our knowledge on which patterns seem to be preferrably
associated to which other ones.

If we can write that down with less bits than the observation table
occupies, we have a more abstract rule (program). 'Abstracting as
Compression'.

The compressing activity which you suggest will - in the context I
intentionally synthesized here - only 'pack' the observed association
for an easier way of archiving it. If you want to 'apply' it next
time, you will be forced to unpack it first. You cannot 'run' the
zipped version of the program.

By the way, the above uncompressed table itself is a 'program', too.
It lets us 'look up' results - which could be done by a 'lookup'
machine. (This is my deviant CasC viewpoint: 'Compute without
compression if you prefer that, but build (better) programs applying
compression.')

If you dig in earlier CasC postings, you will find some link to
Sergio Navega's homepage and an interesting news discussion archive
on HLUTs (Huge LookUp Tables, I think). This illustrates a lot of
which I want to say here. But what that discussion on HLUTs misses:

You must compress them!!

How?
Cascade smaller LUTs to form a larger one.

This will be impossible if our observation is 'white noise' - when
all possible associations occur. Then there is no regularity (rule,
essence).

But this will be possible if, regularly, some samples are missing.
When there are empty regions in the LUT.

If you can compress one HLUT to a cascade of smaller LUTs, you
distill the essence of it. Obviously, this procedure is recursive.

This has become a bit lengthy now. But I hope I could show this:

(a) Zipping data does not lead to a program.
(b) Zipping a program does not lead to a more abstract program.
(c) Hence, there is no use trying to compress an isolated string once
we speak about computing. Do you agree that CasC people should see a
string as a 'program'? OK, it is written as 'data' on the tape of a
Turing Machine. But it means 'activity'. Each digit in a calculation
is some 'piece of instruction'. If it is '1' it triggers a different
**activity** than '0'.

(d) It seems to be a nice task to describe a UCM which is
'super-universal' in a sense that it can compress LUTs like
000011110000111100001111
010010100101010100101010
111101001101010110000010
101010101010001001010110
...
knowing nothing more of their 'meaning' than
-- a column represents a source or drain of information
-- a row represents an observed association.
Two different implementations of this sUCM must come to 'functionally
identical' results (identically functioning programs), not knowing
each other's microprogram. They can find the essence (rule, law) of
the table knowing nothing else than the table itself. They synthesize
a multiplier from multiplication tables - not knowing what
multiplication is.

Thrilling.

Kind regards,

Detlef


__________________________________________________________________
Do You Yahoo!?
Gesendet von Yahoo! Mail - http://mail.yahoo.de



More information about the Casc mailing list