Biological Foundations?

Detlef Morgenstern detlef_morgenstern at alldata.de
Fri Sep 11 05:38:01 PDT 1998


Dear Sergio,

In our "Design a brain!" discussion you wrote:
> Exactly. But with a subtle difference: this "brain" could
> develop its own software, based on its interaction with
> the world.

I see two distinct levels of "software" which must be considered here.

(1)  The "application" level. This is a large library of laws (rules, causal
dependencies) we build up/maintain during our lifetime, "experimenting with
the world". We use these laws to model the world and predict what is going
to happen when certain conditions (causes, assumptions) hold. Feed a law
black box in your brain with "Cause_X1", and it will respond "Effect_X1". As
nature doesn't present explicit laws to our sensors, we must collect large
amounts of cause->effect associations we can observe. We put all the
cause->effect associations of a certain domain in a black box and label it
"LAW_X". We can already query it in this format! But: The fittest to survive
are the animals with the best "prediction performance". This is why evolution
"put much effort" into an efficient "law query technology". Two optimization
goals suggest making a "law black box" as compact as possible:
(a) Limited storage capacity.
(b) The fewer "gates" we must switch, the faster we are.
This is why the brain "runs a compacting daemon", which replaces uncompressed
(raw, redundant) law black boxes with their compressed equivalents
("functionally" identical stimulus->response behaviour). This compacting
daemon is:

(2)  The "firmware" level. It defines one basic operating principle of the
brain - being able to abstract, applying some compression-like technique.
This capability is "built-in". When a child is born, the child is "equipped"
with it. We need not "learn" it (and I think we cannot learn it). We inherit
it.
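
To make the two levels a bit more tangible, here is a minimal Python sketch.
It is only an illustration under my own assumptions: the names RawLaw,
CompactLaw and functionally_identical, the doubling rule, and "number of
stored entries" as the storage unit are all hypothetical and chosen for
brevity. It shows a raw law black box (a plain list of cause->effect
associations) and a compressed equivalent with the same stimulus->response
behaviour.

```python
# Hypothetical observations: cause -> effect associations of one domain.
RAW_ASSOCIATIONS = {0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10}


class RawLaw:
    """Uncompressed law black box: every association is stored verbatim."""

    def __init__(self, associations):
        self.table = dict(associations)

    def query(self, cause):
        # Respond with the stored effect, or None for an unknown cause.
        return self.table.get(cause)

    def size(self):
        # Assumed storage unit: one unit per stored association.
        return len(self.table)


class CompactLaw:
    """Compressed law black box: one rule plus the domain it was seen on."""

    def __init__(self, rule, domain):
        self.rule = rule
        self.domain = domain

    def query(self, cause):
        # Stay silent outside the observed domain, exactly like the raw box.
        return self.rule(cause) if cause in self.domain else None

    def size(self):
        # Assumed storage unit: a single rule counts as one unit.
        return 1


def functionally_identical(law_a, law_b, stimuli):
    """True if both black boxes respond identically to every stimulus."""
    return all(law_a.query(s) == law_b.query(s) for s in stimuli)


raw = RawLaw(RAW_ASSOCIATIONS)
compact = CompactLaw(lambda cause: 2 * cause, range(6))

print(functionally_identical(raw, compact, range(-2, 10)))
print(raw.size(), compact.size())
```

Both boxes answer the same queries in the same way; only the storage cost
differs. Finding the compact rule automatically is the job of the
"firmware", which this sketch does not attempt.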

I agree, it is misleading to simply say "hardware" when I mean this (2)
"inherited basic operating principle of hardware". And of course, you can
"run" (simulate, emulate) this "firmware" principle in any other universal
computing environment - be it hardware centered or software centered. I
wanted to point out that we must not forget about these two completely
different operation levels of the brain, and that a lot of AI frustration
comes from separating one from the other and from attempting to explain
intelligence as being EITHER (1) OR (2). (Or - even worse - (3) ONLY, see
below.)

I repeatedly used the term "law black box" for two reasons:
(a) Because public opinion seems to be
    "It is a law only after it is compressed." (abstracted)
    I hold a different view: The "abstraction ratio" for a law
    can vary, but it is always a law.
(b) To emphasize that it is an active entity - not a sort of passive
    database which must be "browsed" by "The Mind". It is an
    operational entity instead, with a stimulus->response behaviour.

A third software category could be considered:
(3)  "Operating system" level, or "tool" level. These are techniques which
are not inherited but learnt - they are a subset of (1). They improve
overall abstraction/prediction performance. Language: speed up learning by
communicating abstract knowledge. Conscious thinking: inspect, verify,
re-arrange, optimize the abstract law library. Apply abstract laws to other
domains (analogy). Consciously deduce. (Others).

> The problem is how to make this architecture intelligent
> for the first time. This is the great problem, that defies us
> for more than 4 decades.

(The built-in, inherited level, see above.)

> Complexity of the brain is, IMHO, an effect of several
> relatively simple modules operating concurrently.
> The big question seems how to join these modules and
> what each one is responsible for.

I think the abstracting capability is not concentrated in "a couple of
modules which scan and compact our knowledge database", it is a completely
distributed feature at the "logical gate" level.

> That's why I think neuroscience is a worthy field of study
> for AI researchers, because we have a lot to learn from
> the organization of our own brain.

I agree, as long as we do not restrict it to "intelligence can reside in
neurons only".

> I don't see the importance of the hardware for AI.
> The main problem is to get to the principles of
> operation of intelligence.

(see above)

> Induction is one of the most criticized philosophical
> aspects of human reasoning. Karl Popper is known
> to be a fierce critic of induction. It is not difficult to perceive
> the harms of induction: taken without care it leads to
> fantasies and mysticism. But this "bad" side of induction
> is not enough to obscure the advantages of it.

We have two types of induction at our command: the subconscious compacting
firmware feature, and a "conscious discipline of explicit reasoning".

> The main problem is how to develop inductive machines
> that can recognize when to stop.

I see no problem at all. Induction is a transformation procedure which aims
at increasing the abstraction ratio of a "lawbase" (not a database!). Each
transformation step can be verified by checking whether the
stimulus->response functionality of the "lawbase" is still identical to what
it was before that step. Abstraction ratio can be (reciprocally) measured in
"storage resource usage" units. And induction can be stopped when the
abstraction ratio has reached a maximum (when we cannot find further
transformation steps which increase abstraction without losing functional
integrity).
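
The stopping rule can be sketched in a few lines of Python. This is only one
possible reading, under assumptions of my own: rules are bit patterns with
"-" as a wildcard, the only transformation step is merging two rules that
have the same effect and differ in one bit, and storage is counted in number
of rules. All function names are hypothetical. Note that such a greedy loop
stops at a local maximum of abstraction, which need not be the global one.

```python
from itertools import product


def matches(pattern, stimulus):
    """A pattern bit matches if it is '-' or equal to the stimulus bit."""
    return all(p in ("-", s) for p, s in zip(pattern, stimulus))


def query(lawbase, stimulus):
    """Return the effect of the first matching rule, or None."""
    for pattern, effect in lawbase:
        if matches(pattern, stimulus):
            return effect
    return None


def behaviour(lawbase, width):
    """Complete stimulus->response table over all bit strings of a width."""
    stimuli = ("".join(bits) for bits in product("01", repeat=width))
    return {s: query(lawbase, s) for s in stimuli}


def try_merge(p, q):
    """Merge two patterns differing in exactly one fixed bit, else None."""
    diff = [i for i, (a, b) in enumerate(zip(p, q)) if a != b]
    if len(diff) == 1 and "-" not in (p[diff[0]], q[diff[0]]):
        i = diff[0]
        return p[:i] + "-" + p[i + 1:]
    return None


def induce(lawbase, width):
    """Apply verified merge steps until no further step can be found."""
    lawbase = list(lawbase)
    reference = behaviour(lawbase, width)
    improved = True
    while improved:
        improved = False
        for i in range(len(lawbase)):
            for j in range(i + 1, len(lawbase)):
                (p, effect_p), (q, effect_q) = lawbase[i], lawbase[j]
                merged = try_merge(p, q) if effect_p == effect_q else None
                if merged is None:
                    continue
                candidate = [r for k, r in enumerate(lawbase) if k not in (i, j)]
                candidate.append((merged, effect_p))
                # Accept the step only if functional integrity is preserved.
                if behaviour(candidate, width) == reference:
                    lawbase, improved = candidate, True
                    break
            if improved:
                break
    return lawbase


# Hypothetical raw lawbase: the effect always equals the second input bit.
raw = [("00", "0"), ("01", "1"), ("10", "0"), ("11", "1")]
compact = induce(raw, width=2)
print(len(raw), len(compact), compact)
```

Each accepted step shrinks the lawbase by one rule, so the loop must
terminate; it stops exactly when no verified, storage-reducing step is left.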

The bad side of induction casts its shadow on us when we apply it on the
conscious level without obeying these "instructions for use". And as we tend
to be weak on the conscious level, it becomes a dangerous tool. It is not
induction that is to blame, but the weak human applying it without caution
and self-discipline.

> The basic principle behind induction is that everything
> that happened several times in the past will have a good
> chance of happening in the future.

I disagree. In many cases, we induce a law having sensed *different*
cause->effect associations which did not repeat at all. We want to find out
*what is common* to them even if they look very different. This, I think, is
the powerful side of induction. By limiting it to repeated (in space and
time) occurrence of similar sensations, we throttle its power nearly to
idle. This is why
(a) All known implementations of pattern recognition perform so poorly
(b) Compression which only seeks similarities does not find abstractions
    but leads to "binary soup".
Is there some regularity in these observations?:
0110001001000
1101101111000
Is it just "000" or "110" or "11" occurring repeatedly?

This was one reason for me to ask 'What is regularity? How does a
"regularity detector" work?' Is it simply "repeated occurrence of similar
instances of something" or is it something more global?

You might object that only after having seen one and the same cause->effect
association several times may we be sure "there is really some law behind
it". I disagree again. If it occurred at least once, there was a law behind
it. What we forget is that there was some context in which we observed the
event. And this context provides part of the law. If we do not sense the
context we cannot find "the law". Doing things repeatedly (in different
contexts!) or demanding that similar events must be observed repeatedly is
just to make sure we did not ignore a hidden context. It is simply for
feeling more comfortable about our observation & prediction precision. But
it has nothing to do with the precision of induction, which - if correctly
applied - will be 100%. You say it a bit differently, but aim at the same:

> Then, this allows us to keep expectancies about the future.
> When those expectancies doesn't fulfill, this may inform us of
> something important. This deviation may mean that we need
> to take into consideration additional factors in our prediction,
> which will lead us into revising our original induction.

> When this process is applied for some time, we will end up
> with a lot of "rules" that hold with great probability. These rules
> are the building blocks of our theories and these rules are best
> manipulated using deductive reasoning. For me, deductive
> reasoning can only be applied after initial application of induction.

Mostly agree, but in my view, we can apply "entry level deduction" even
after having seen only lots of cause->effect associations, still not having
compacted (induced from) them. Once we "know" the (flat) cause->effect
association, and we see the cause, we can deduce this very effect. Of
course, this is deduction in idle gear. It will develop its power only on
induced knowledge, helping predict even effects for causes never observed
before.

This is the answer to the above regularity question. The law is:
0110  0010  01000     // 06+02=08
1101  1011  11000     // 13+11=24
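
For anyone who wants to check this mechanically, here is a small Python
sketch. The field layout (two 4-bit causes followed by a 5-bit effect) is
taken from the two lines above; the function names are my own and purely
illustrative.

```python
def split_observation(bits):
    """Split a 13-bit observation into (cause_a, cause_b, effect) integers."""
    return int(bits[0:4], 2), int(bits[4:8], 2), int(bits[8:13], 2)


def obeys_addition_law(bits):
    """True if the 5-bit effect equals the sum of the two 4-bit causes."""
    a, b, effect = split_observation(bits)
    return a + b == effect


def predict_effect(cause_a, cause_b):
    """Apply the induced law to two 4-bit causes given as bit strings."""
    return format(int(cause_a, 2) + int(cause_b, 2), "05b")


observations = ["0110001001000", "1101101111000"]
for bits in observations:
    print(split_observation(bits), obeys_addition_law(bits))
```

Neither "000" nor "110" occurring repeatedly plays any role here; the
regularity lies in the relation between the three fields.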


And here we are at the actual inductive danger. We are tempted to apply our
(induced) law to hypothetical causes never observed. The law will predict an
effect, because there is some algorithmic power in it to bridge the gap
between discrete observation samples. To be correct, we ought to label the
predicted effect "Danger: Never observed before!". If we later find out that
our prediction was wrong, we have at least a counter-example now. This can
be added to the plain "lawbase", on which induction must be "re-run".
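
A rough Python sketch of this bookkeeping follows. Please read it as an
illustration only: instead of a real compression-based induction it simply
picks the first law from a small, fixed candidate list that agrees with all
observations. That shortcut, the candidate list and the class name Lawbase
are assumptions of mine, not the mechanism I argue for above.

```python
import operator

# Hypothetical candidate laws, tried in this order during induction.
CANDIDATE_LAWS = [
    ("or", operator.or_),
    ("xor", operator.xor),
    ("add", operator.add),
]

WARNING = "Danger: Never observed before!"


class Lawbase:
    """Plain observations plus the law currently induced from them."""

    def __init__(self):
        self.observed = {}          # cause (a, b) -> effect
        self.law_name = None
        self.law = None

    def induce(self):
        """Stand-in induction: first candidate consistent with all samples."""
        self.law_name, self.law = None, None
        for name, law in CANDIDATE_LAWS:
            if all(law(*cause) == effect
                   for cause, effect in self.observed.items()):
                self.law_name, self.law = name, law
                return

    def observe(self, cause, effect):
        """Add an observation (possibly a counter-example) and re-run."""
        self.observed[cause] = effect
        self.induce()

    def predict(self, cause):
        """Return (effect, label); unobserved causes carry the warning."""
        if cause in self.observed:
            return self.observed[cause], "observed"
        if self.law is None:
            return None, "no law"
        return self.law(*cause), WARNING


lawbase = Lawbase()
lawbase.observe((1, 2), 3)                # fits "or", "xor" and "add" alike
first_law = lawbase.law_name
first_guess = lawbase.predict((6, 2))     # over-generalized, but labelled
lawbase.observe((6, 2), 8)                # counter-example: re-run induction
print(first_law, first_guess, lawbase.law_name, lawbase.predict((13, 11)))
```

With a single sample the over-generalized "or" law survives; the
counter-example 6, 2 -> 8 eliminates it and the re-run settles on addition.
Predictions for causes never observed keep their warning label either way.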

(By the way, what does "wrong" mean? What is "irregularity"?)

In a nutshell, induction as such is a 100% correct method, but it can lead
to deadly dangerous results when carelessly applied:
(a) Dropping observation context (too few sensory "lines")
(b) Over-generalization (too few different observation samples).
Twice "too few" - intuitively we try to cure that by observing repeatedly.

> AI started wrong because it focused directly on the
> deductive aspect.

It is the easier part. Let us not blame people for going from the easier to
the more difficult things. If we are aware that we cannot induce by applying
deduction techniques exclusively, we are on the right track.

> The "logicist" approach (Newell, Simon, McCarthy) proposed
> that intelligence was just the result of logic manipulations.

We should be able to describe intelligence in terms of logic manipulations.
Foundations of intelligence cannot be illogical, or logic-free. When we aim
at finding a "basic principle", we need not look outside logic. I think we
must completely move the focus. Instead of permanently adding complexity to
logic on the higher (complex) levels, we must revise its firmware,
introducing some new "particle" symmetry.

"Hardware" again...

Regards,

Detlef Morgenstern
(Dresden 980911, mailto://detlef_morgenstern@alldata.de
Computing as Compression Mail List: http://www.wco.com/~sanna/casc/ )




