Regularity

Detlef Morgenstern detlef_morgenstern at alldata.de
Thu Jan 28 04:44:53 PST 1999


Gerry Wolff wrote:

(see:
http://www.wco.com/~sanna/casc/archive/msg00091.html
http://www.wco.com/~sanna/casc/archive/msg00092.html
http://www.wco.com/~sanna/casc/archive/msg00093.html)

> The term 'symbol' often means things like '5' or '+' which,
> in arithmetic, have a well-defined meaning which is stored
> elsewhere in the system (in the mechanisms which execute
> arithmetic statements) and are not displayed alongside each
> symbol. In this sense, these symbols have 'hidden' meaning.

The meaning of these symbols is that they are tags for a function
specification (you say 'mechanisms' and mean the same thing that I mean when
I say 'functionality'). This specification (definition, description,
algorithm) is commonly available. It describes which causes produce which
effects. If it seems hidden, that is only because it is not explicitly
explained each time it is applied. It is a reference to a commonly
accessible "function library", where function symbols are

> ... given meanings by writing them alongside their meanings in
> exactly the same way that a word in a dictionary is written
> alongside some other words which explain its meaning...

[originally your statement about SP]

Those 'other words' in a function description are references to other
functions, and so on, until (at the lowest resolution/granularity level)
they are references to the built-in functionality (firmware) of the machine
or of the consciousness (mental "firmware" functions).

> In this case, the meaning of an SP symbol appears alongside it
> and is not 'hidden' like the meaning of arithmetic symbols.

When SP grows larger, there will be a need to introduce "libraries" for
efficiently tagging successfully established compression results (shorthands
for commonly used rules, abstractions). This is the point at which a
'hidden' meaning will emerge in SP, too.

There is no need to maintain a distinction between the "functional" symbol
above and a CasC/SP symbol. If one senses a gap here, it is the one I have
been trying to bridge all along. Let me try putting it one more way:

Data (observations) can be considered factual entities, scalar things
(assemblies, associations, aggregations). This is what you do when you say
'body of data'.

When I say 'functionality', I see data (observations) as snapshots of a
cause->effect activity, as causal entities (vector aggregations). The
'hidden' vector meaning is this: I suggest associating a tag with each column
of the observation table saying either "this is part of the cause" or "this
is part of the effect" (and possibly "cannot yet be determined whether this
belongs to cause or to effect").

I hold that if this causality (vector, direction) is dropped, we massively
lose potential for inductive compression (hence: potential for inducing
rules), because most of the rules we are interested in say "what follows
from what".

I agree that my addition example was not the best one to show how I
understand inductive compression. Let me try to demonstrate it with a more
compact and more obvious rule-finding task.

[Best viewed in a fixed-width font (e.g. Courier) or as plain text]

"Guess the rule in:
11010001
01111111
11111111
00011111
10001001
01010101
00111111
10101101
!"

And this is one possible (functional) solution:

1. Name the game. Tag the whole body with a symbol. Tag each observation
   column with a symbol:
Vector     (Symbol/tag for the body of "data": "CasC Vector Example")
ABCDEFGH   (observation tag, one letter per column/sensorial line)
--------
11010001
01111111
11111111
00011111
10001001
01010101
00111111
10101101

2. Tag (hypothetical) causes with 0 and (hypothetical) effects with 1.
   [The "Regularity (2)" article will show the criteria for isolating
   (hypothetical) causes.]
   Sort the rows by their causes for a better understanding:
Vector
00011111   (0=cause, 1=effect)
ABCDEFGH
--------
00011111
00111111
01010101
01111111
10001001
10101101
11010001
11111111

[To keep this from becoming massively puzzling, I intentionally did not
permute the **columns** in the given body of data. So the causes (A,B,C) come
first here "by mere chance".]
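
As a rough illustration of step 2 (not part of the procedure described
here), the tagging and sorting can be sketched in Python. The names ROWS,
TAGS, split and is_functional are assumptions made for this sketch. The
consistency check is only one plausible necessary condition for a
cause/effect hypothesis, not the isolation criteria promised for
"Regularity (2)".

```python
# Observations from the puzzle, in the order they were given.
ROWS = ["11010001", "01111111", "11111111", "00011111",
        "10001001", "01010101", "00111111", "10101101"]
TAGS = "00011111"  # 0 = hypothetical cause, 1 = hypothetical effect


def split(row, tags=TAGS):
    """Split one observation into its cause part and its effect part."""
    cause = "".join(v for v, t in zip(row, tags) if t == "0")
    effect = "".join(v for v, t in zip(row, tags) if t == "1")
    return cause, effect


def is_functional(rows, tags=TAGS):
    """True if equal cause parts never occur with different effect parts."""
    seen = {}
    for row in rows:
        cause, effect = split(row, tags)
        # setdefault stores the first effect seen for this cause and
        # returns the stored one on every later occurrence.
        if seen.setdefault(cause, effect) != effect:
            return False
    return True


# Sorting by the cause part reproduces the table of step 2.
sorted_rows = sorted(ROWS, key=lambda r: split(r)[0])
```

With the tags 00011111 every cause combination maps to exactly one effect
pattern, so the hypothesis survives. Tagging only column A as a cause
(01111111) fails the same check.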

3.  Generate new function patterns or browse existing ones (from a library?)
   and match them against the data [a lot of overhead; most of the hypotheses
   must be discarded]. This is a search. These two patterns are helpful:
->    AND  (function tag, symbol)  [-> : "Implication", "Conditional"]
001   001  (0=cause, 1=effect)     [AND: Logical And]
???   ???  ("variables")
---   ---
001   000
011   010
100   100
111   111
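
The two patterns above can be held in a small "function library". The
following Python sketch is illustrative only: LIBRARY and table are assumed
names, and each pattern is read as two cause columns followed by one effect
column, as the 001 tag line indicates.

```python
from itertools import product

# A tiny "function library": tag -> two-argument Boolean function.
LIBRARY = {
    "->": lambda x, y: int((not x) or y),   # implication / conditional
    "AND": lambda x, y: int(x and y),       # logical and
}


def table(fn):
    """List fn over all two-bit inputs as 'xyz' strings (x, y cause; z effect)."""
    return [f"{x}{y}{fn(x, y)}" for x, y in product((0, 1), repeat=2)]
```

Calling table(LIBRARY["->"]) gives the rows 001, 011, 100, 111, and
table(LIBRARY["AND"]) gives 000, 010, 100, 111, matching the two columns
above.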

4.  "Unify":
->    ->    ->    AND   ->
001   001   001   001   001
ABD   BCE   ACF   DEG   GFH
---   ---   ---   ---   ---
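
One naive way to arrive at these assignments is an exhaustive search, which
also shows where the overhead mentioned in step 3 comes from. The Python
sketch below is an assumption-laden illustration, not the procedure itself:
it assumes that each effect column depends on exactly two columns to its
left and that the library holds only "->" and "AND". The names unify and
column are made up for this sketch.

```python
from itertools import permutations

ROWS = ["11010001", "01111111", "11111111", "00011111",
        "10001001", "01010101", "00111111", "10101101"]
COLUMNS = "ABCDEFGH"
TAGS = "00011111"  # 0 = hypothetical cause, 1 = hypothetical effect
LIBRARY = {
    "->": lambda x, y: int((not x) or y),
    "AND": lambda x, y: int(x and y),
}


def column(name):
    """Values of one observation column, top to bottom."""
    i = COLUMNS.index(name)
    return [int(row[i]) for row in ROWS]


def unify():
    """For each effect column, list every (tag, arg1, arg2) that fits all rows."""
    known = [c for c, t in zip(COLUMNS, TAGS) if t == "0"]
    matches = {}
    for target in (c for c, t in zip(COLUMNS, TAGS) if t == "1"):
        wanted = column(target)
        matches[target] = [
            (tag, x, y)
            for tag, fn in LIBRARY.items()
            for x, y in permutations(known, 2)   # ordered pairs: "->" is not symmetric
            if [fn(a, b) for a, b in zip(column(x), column(y))] == wanted
        ]
        known.append(target)  # an explained effect may serve as a later cause
    return matches
```

The search returns all fitting hypotheses, so a column may have several
candidates. Column H, which is always 1, is matched by more than one
implication, and the choice of G -> F among them is an interpretation that
the data alone do not force.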

Thus, the original observation evidence can be interpreted as following the
hypotheses in (4.), where "->" and "AND" are tags for the rules found in
(3.). The (inductive) compression result can be written down "functionally"
as something like this:

Vector     ->    AND   ->    ->    ->    AND   ->
00011111   001   001   001   001   001   001   001
ABCDEFGH   ???   ???   ABD   BCE   ACF   DEG   GFH
--------   ---   ---   ---   ---   ---   ---   ---
000        001   000
001        011   010
010        100   100
011        111   111
100
101
110
111

[Please translate this into "SPish". The "???" variable notation is
certainly not sufficient, because it does not reflect an "order of
arguments", which must be obeyed when (as in "->") the function is not
symmetric in its arguments.]
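
To check that nothing is lost, the compressed form can be expanded back
into the observations. The Python sketch below is only an illustration and
says nothing about how SP would encode it. RULES, expand and OBSERVED are
assumed names, and each rule is written as (tag, first argument, second
argument, result) so that the order of arguments is explicit.

```python
from itertools import product

LIBRARY = {
    "->": lambda x, y: int((not x) or y),
    "AND": lambda x, y: int(x and y),
}

# (tag, first argument, second argument, result column)
RULES = [
    ("->", "A", "B", "D"),
    ("->", "B", "C", "E"),
    ("->", "A", "C", "F"),
    ("AND", "D", "E", "G"),
    ("->", "G", "F", "H"),
]

OBSERVED = ["11010001", "01111111", "11111111", "00011111",
            "10001001", "01010101", "00111111", "10101101"]


def expand(causes="ABC", rules=RULES):
    """Regenerate every observation row from cause values and the rules."""
    rows = []
    for bits in product((0, 1), repeat=len(causes)):
        env = dict(zip(causes, bits))
        for tag, x, y, out in rules:
            env[out] = LIBRARY[tag](env[x], env[y])
        # Column names are single letters, so sorting restores A..H order.
        rows.append("".join(str(env[c]) for c in sorted(env)))
    return rows
```

Here expand() yields the same eight rows as the sorted observations, so the
five rules plus the two library tables carry the whole table.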

The real world interpretation (which this "purpose free/context free"
example, of course, cannot find "from within") is:
"  It (H) is always true that from
        fact G, which means that
             fact D: "A implies B" is true AND
             fact E: "B implies C" is true
   follows
        fact F: "A implies C".   "

In other (formal logic) words, we are allowed to conclude that:
"If B follows from A and C follows from B, then C will follow from A".
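
That this conclusion holds for every combination of A, B and C can be
checked by brute force. The following Python lines are a minimal sketch;
implies, is_tautology and syllogism are names chosen for the illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def is_tautology(formula, n_vars):
    """True if formula holds under every assignment of its n_vars arguments."""
    return all(formula(*bits) for bits in product((False, True), repeat=n_vars))


def syllogism(a, b, c):
    """((A -> B) AND (B -> C)) -> (A -> C), i.e. column H of the example."""
    return implies(implies(a, b) and implies(b, c), implies(a, c))
```

is_tautology(syllogism, 3) is True, which corresponds to column H being 1 in
every row. A plain implication such as A -> B is not a tautology, which is
why columns D, E and F contain zeros.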

The compression effect is not significant here. This is due to the
simplicity of the example. It will be greater with larger "bodies of data".

This is a sketch of the task setup and of a (possible) result of a still
hypothetical "inductive compression" procedure. But it might work. And, back
to compression, rules found this way should assist compression significantly.

> After we have done this for the finite number of rows that we have
> been shown, we see that the only function that works for every row
> is +. In short, + ***repeats*** on every row! Here is the repetition
> that we are discussing.

I feel we are converging. I agree with repetition if it is meant in this very
functional sense. We can find functions (rules, regularity) in the world
only because we are confronted with repeated occurrences of traces of their
activity.

The key is which comparison procedure is installed in the rule finder. Does
it just compare by a
   "Several-strings-contain-similar-substring"
criterion, or does it tend towards something like
   "Several-strings-describe-similar-functionality"?

Gerry, your background is language analysis; mine is electronic
design/automatic control. This is reflected in our understanding of
similarity (regularity). I suggest making the comparison procedure (in SP??)
replaceable, so that we can adaptively "update the firmware" when we find
better regularity criteria.
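
What a replaceable comparison procedure might look like can be sketched in
Python. Everything here is an assumption made for illustration: the two
criteria are deliberately crude stand-ins for the two views, the strings are
three-bit observations read as "cause cause effect", and none of the names
refer to anything in SP.

```python
LIBRARY = {
    "->": lambda x, y: int((not x) or y),
    "AND": lambda x, y: int(x and y),
}


def shares_substring(a, b, min_len=2):
    """Criterion 1: some substring of a with min_len characters occurs in b."""
    return any(a[i:i + min_len] in b for i in range(len(a) - min_len + 1))


def shares_function(a, b):
    """Criterion 2: one library function explains both observations."""
    def explains(fn, s):
        x, y, z = (int(ch) for ch in s)
        return fn(x, y) == z
    return any(explains(fn, a) and explains(fn, b) for fn in LIBRARY.values())


def similar_pairs(strings, criterion):
    """The rule finder's comparison step, with the criterion passed in."""
    return [(a, b)
            for i, a in enumerate(strings)
            for b in strings[i + 1:]
            if criterion(a, b)]
```

For the strings 001, 111 and 101, the substring criterion pairs 001 with 101
(shared "01"), while the functional criterion pairs 001 with 111 (both fit
"->"). Swapping the criterion argument is the whole "firmware update".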

Best wishes,

Detlef



