Fundamental Compressionist Philosophy.
Gerry Wolff
gerry at informatics.bangor.ac.uk
Fri Apr 27 09:56:45 PDT 2001
----- Original Message -----
From: "Andrew Stanworth" <andrew.stanworth at bigfoot.com>
To: <casc at sanna.com>
Sent: 26 April 2001 11:09
Subject: Fundamental Compressionist Philosophy.
> I would like to state, right from the off, that in my opinion this group
> has been named too reservedly, perhaps that is because of its academic
> origins? It should not be called "Computing as Compression" it should
> instead be called "Computing IS Compression" but now it's too late.
> Maybe when people read CASC they could think CISC - a virtual rename ; )
>
OK, I'm guilty :-) on two counts: 1) being an academic, 2) coining the term
"Computing as Compression".
In defence of the latter, I chose 'as' rather than 'is' because so many
people, including the highly respected Alan Turing, have not seen computing
in these terms. There is usually more than one way of looking at any topic so I
feel it is safer to use a title that acknowledges that there may be other
ways of viewing the same area of interest. This is "future proof" because it
means that other theories can be put forward as alternative views without
the need to 'destroy' older views. Quantum theory and relativity are
alternative views of the physical world that rub along quite happily, each
with their own strengths and weaknesses. Why not the same for alternative
theories of computing?
...
> To cut a long story short (and quickly rattling it off), any universal
> computer needs just two components, the first of which is simply a data
> entity which is capable of being chained together with others to
> form data 'patterns' (a fundamental data 'atom' is one which must be
> capable of two states since two states, in some form or other, are needed
> to make any sort of pattern). The second element which is required is a
> switch able to dynamically link segments of one data pattern/chain to
> segments of any other, in a deterministic fashion, according to the value
> of an input (a fundamental switch 'atom' being able to take a two state
> input to select one of two segments to link to). In this way, complex
> outputs can be encoded as pathways through compressed data (Note: fixed
> data entities can be linked to switch inputs to preserve the fixed nature
> of data, whereas a variable data input source, wherever that may
> actually come from, is capable of producing new/novel patterns).
>
> The way you link data chains to input switches determines the
> architecture of your computer, Turing, PC, Parallel, human
> knowledge/reasoning systems, whatever.
>
> The deep question to ask, is why any form of switch is required, since
> any finite output/data pattern could pre-exist as a fixed sequence (note
> that it is for this reason that it is the switch which actually does all
> of the 'computing'). The advantage of a switch, over the fixed data
> pattern is twofold. Firstly it allows data economy (the economical use of
> available resources) whereby, it becomes possible to compact data. Thus,
> for example, if two data patterns had the same pattern for their first
> half, but a divergent pattern thereafter, a switch could be used to bind
> them together which would preserve data output integrity after one of the
> duplicated sequences had been removed. Thus the switch itself should be
> regarded as the fundamental level of data compaction (as well as of
> algorithmic functioning - a.k.a. computing). The second advantage is that
> a switch allows the construction of non-finite output patterns, through
> loops and recursion which can be thought of as the compaction of
> non-finite sets within a finite environment.
>
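The first advantage described above (two patterns with the same first half bound together by a switch) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names merge_with_switch and output, and the dictionary layout, are hypothetical and do not come from Andrew's message.

```python
def merge_with_switch(a, b):
    """Store the shared leading segment once, followed by a two-way switch.

    Hypothetical layout: 'prefix' is the common start of both patterns and
    'branches' holds the two divergent tails selected by a one-bit input.
    """
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return {"prefix": a[:n], "branches": (a[n:], b[n:])}


def output(store, bit):
    """Follow the pathway chosen by the switch input (0 or 1)."""
    return store["prefix"] + store["branches"][bit]


# Two patterns with the same first half and a divergent second half.
a = "10100011"
b = "10101100"

store = merge_with_switch(a, b)

# Both original patterns can still be produced in full.
assert output(store, 0) == a
assert output(store, 1) == b

# Symbols held after the duplicated half has been removed.
stored_symbols = len(store["prefix"]) + sum(len(t) for t in store["branches"])
print(stored_symbols, "symbols stored instead of", len(a) + len(b))
```

The sketch covers only the data-economy point; loops and recursion (the second advantage) are not modelled here.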
...
Rather than using a 'switch', I have borrowed an idea that seems to lie at
the heart of most of the standard techniques for information compression:
any relatively long repeating pattern can be replaced by a relatively short
'identifier', 'code', 'name' or 'reference'.
The neat thing about using this traditional idea is that decompression can
be done in exactly the same way as the original process of compression: find
patterns that match each other and then merge or 'unify' the matching
patterns. In the original compression, it is the relatively long patterns
that are matched and unified. In the process of decompression, it is the
code patterns that are matched and unified.
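As a rough illustration of the idea, here is a minimal Python sketch of ordinary dictionary-style substitution. It is not the matching-and-unification mechanism described in the articles; the function names, the '#' code and the example sentence are assumptions made up for this sketch. It only shows that both directions come down to finding a match and replacing it.

```python
def compress(text, pattern, code):
    """Replace each copy of a relatively long repeating pattern by a short code."""
    if code in text:
        raise ValueError("the code must not already occur in the text")
    # The table records what the code stands for; it has to be kept as well.
    return text.replace(pattern, code), {code: pattern}


def decompress(encoded, table):
    """Match each code in the encoded text and put the full pattern back."""
    for code, pattern in table.items():
        encoded = encoded.replace(code, pattern)
    return encoded


text = "the cat sat on the mat; the cat sat on the hat"
encoded, table = compress(text, "the cat sat on the ", "#")

print(encoded)
assert decompress(encoded, table) == text
```

In compression the match is made on the long pattern; in decompression the match is made on the short code, which is the symmetry described above.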
This is most fully explained in my articles and reports about natural
language processing (http://www.sees.bangor.ac.uk/~gerry/NL_processing.htm).
Thanks to Andrew for interesting ideas and getting our discussions going
again.
Best wishes,
Gerry