BISC: Causality

Subject: BISC: Causality
From: Masoud Nikravesh
Date: Fri Dec 08 2000 - 09:15:52 MET

Berkeley Initiative in Soft Computing (BISC)

Causality is certainly a central concept in many branches of science.
In a way, it is like "consciousness" -- a word with many meanings and
facets, each important, many of which I have explored rather extensively.
Because it is so complex, I cannot reply as coherently as Lotfi, but
perhaps a few fuzzy thoughts from different viewpoints might be of
interest... and please forgive the groping nature of my response...

A quick summary of my response:

1. There are a variety of concepts of "causality," some of which are
extremely precise. In discussing
these concepts with a broad scientific audience, it is important to keep
the multiple concepts in mind,
just as one would when discussing "consciousness."

2. The everyday way of TALKING about causality, when we say (for example)
"the Great Depression
caused the Nazi takeover of Germany," involves a style of reasoning which
can only be represented in terms of fuzzy logic.
This is certainly pervasive in the reasoning of humans today.

3. Progress in science and social science generally requires a migration
from that everyday style of thinking,
to a style of thinking which uses other concepts of causality, which are
easily and naturally represented
in terms of Bayesian concepts. These also deserve respect.

4. Progress in our thinking also requires appreciation of the fact that we
always must maintain a balance between
what we really know well (and can formulate precisely) and what we have not
yet fully digested (where everyday
types of reasoning are unavoidable to some extent). Artificial systems for
reasoning face the same difficulty as
we do in this respect.
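To make point 2 concrete, here is a toy sketch of how fuzzy degrees of
causation might be combined. Everything here -- the factor names, the 0..1
degrees, the min/max connectives -- is my own illustration, not anything
from the discussion itself.

```python
from functools import reduce

def fuzzy_and(a, b):
    """Combine fuzzy truth values with the standard min t-norm."""
    return min(a, b)

def fuzzy_or(a, b):
    """Combine fuzzy truth values with the standard max t-conorm."""
    return max(a, b)

# Hypothetical degrees (0..1) to which each factor "caused" the outcome;
# the numbers are purely illustrative.
causes = {
    "Great Depression": 0.8,
    "Versailles Treaty": 0.6,
    "political instability": 0.7,
}

# "The Depression AND political instability caused it" -- joint degree:
joint = fuzzy_and(causes["Great Depression"], causes["political instability"])

# "At least one of these factors caused it" -- overall degree:
any_cause = reduce(fuzzy_or, causes.values())

print(joint)      # 0.7
print(any_cause)  # 0.8
```

The point is only that everyday "A caused B" talk assigns graded, not
crisp, truth values -- and that those grades can still be manipulated
systematically.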

After that... below is the groping or mental thrashing which led me to this summary.

Best of luck,

    Paul W.

> ________________
> Causality occupies a position of centrality in human cognition. In
> particular, it plays an essential role in human decision-making by
> providing a basis for choosing that action which is likely to lead to a

Certainly I agree with this. In fact, a kind of imputation of causality is
important to
cognition even at levels of intelligence below the human level.

In fact, when Steve Grossberg complains about the simple reinforcement
learning models of animal psychology... he always stresses that there are
TWO kinds of experiments which dominate the literature. There are
experiments based on reward and punishment ("operant conditioning" or
"Skinner stuff"), where the simple reinforcement learning models do very
well.
But then there are experiments which demonstrate the existence and
prominence of an EXPECTATIONS system.
("Classical conditioning" or Pavlov stuff.)

When we try to model or replicate brain-like intelligence, it is crucial
that we have a subsystem which implements
EXPECTATIONS... that, in turn, requires a system which learns cause and
effect relations, which are the basis of
our expectations.

Actually, in 1987 (Jan/Feb), I published a paper in IEEE Transactions on
Systems, Man and Cybernetics which
reiterated my strawman model of intelligence in the brain, using a kind of
model-based reinforcement learning design.
There were three main subsystems of that model. In modern language, we
would call those "Critic", "Model"
and "Action" networks. But instead of "Model", I called it "Cause and
effect subsystem."
(I also described the link between dynamic programming and reinforcement
learning, and the critic training method which
was called "generalized TD" in Sutton's 1988 paper ... and I described
problems with that method, and a proposed more
powerful approach. Sutton invited me up to GTE to discuss the paper later
in 1987.)
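For readers who have not seen such a design, here is a minimal sketch of
the three-subsystem idea -- a Model ("cause and effect subsystem")
learning to predict the next state, and a Critic trained by a TD(0)-style
rule. This is my own toy reconstruction in a linear scalar setting, not
the equations of the 1987 paper; the coefficients and the toy environment
are invented for illustration.

```python
import random

random.seed(0)   # reproducible toy run

alpha = 0.1      # learning rate
gamma = 0.9      # discount factor

w_model = 0.0    # Model ("cause and effect"): predicts x(t+1) ~ w_model * u(t)
w_critic = 0.0   # Critic: value estimate V(x) ~ w_critic * x

def reward(x):
    # Hypothetical goal: stay near x = 1
    return -(x - 1.0) ** 2

x = 0.0
for t in range(1000):
    u = random.uniform(-1, 1)        # Action: here just random exploration
    x_next = 0.5 * u + 0.1 * x       # "true" environment, unknown to the learner

    # Model learning: fit the cause-and-effect relation u(t) -> x(t+1)
    w_model += alpha * (x_next - w_model * u) * u

    # Critic learning: TD(0)-style update on the value estimate
    td_error = reward(x_next) + gamma * (w_critic * x_next) - w_critic * x
    w_critic += alpha * td_error * x

    x = x_next

# w_model should end up near the true action coefficient, 0.5
```

A full design would also train the Action subsystem by backpropagating
value through the learned Model -- which is exactly why the Model, the
learned cause-and-effect relation, sits at the center of the whole thing.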

So: yes, the issue of causality and how to learn cause-and-effect relations
is central to any effort to understand or replicate
brain-like intelligence.

Likewise, the issue of causality in forwards time and backwards time is of
central importance to the foundation of physics...

> desired result. There is an enormous literature on causality, spanning
> philosophy, psychology, law, statistics, system theory, decision
> analysis and physics, among others. And yet, there does not exist a
> definition of causality which satisfies the following criteria. My
> contention is that any attempt to define causality within the conceptual
> structure of classical logic and probability theory has no chance of
> success.

What is success? Both in neural networks and in physics, it seems enough to
add things to
probability theory or Bayesian assumptions; it is not necessary to violate
them, so far
as I can tell.
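As one example of what "adding things to probability theory" might look
like: a standard probabilistic measure of causal strength is the
difference P(B|A) - P(B|not A), estimated from counts. The data below are
invented for illustration.

```python
def delta_p(events):
    """Causal strength as P(B|A) - P(B|not A), from (a, b) boolean pairs."""
    n_a = sum(1 for a, b in events if a)
    n_not_a = len(events) - n_a
    p_b_given_a = sum(1 for a, b in events if a and b) / n_a
    p_b_given_not_a = sum(1 for a, b in events if not a and b) / n_not_a
    return p_b_given_a - p_b_given_not_a

# Toy data: A occurred in 6 trials (B followed in 5 of them);
# A was absent in 4 trials (B occurred in 1 of them).
data = ([(True, True)] * 5 + [(True, False)]
        + [(False, True)] + [(False, False)] * 3)
print(delta_p(data))   # 5/6 - 1/4, about 0.583
```

Nothing here violates probability theory; the question is only whether
such measures capture everything we mean by the word.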

> 1. The definition is general in the sense that it is not restricted to a
> narrow class of systems or phenomena.

As with "consciousness," we may be able to live with multiple definitions,
so long as
we find ways to unify mathematics and understand relations across the
various areas.

> 2. The definition is precise and unambiguous in the sense that it can be
> used as a basis for logical reasoning and/or computation.

Is this necessary?

MUST the definition of a word be precise in order to be useful?

Think about this!

Can the word "causality" be useful even if its definition is fuzzy and imprecise?

Actually... as a practical matter... I have found that the word
"consciousness" can be very destructive...
the word is almost like nitroglycerin, something which must be handled very
very carefully, because the
multiple definitions cause great confusion if it is not handled very very
carefully... the ambiguity has
caused great harm in some cases... but it has also caused great good, in
other cases...

I have not seen the same degree of practical problem with the word
"causality." At least, I think I haven't.
There have been serious problems at times in physics, where "causality" is
sometimes treated as a synonym for
"time-forwards causality," and it is very important to have multiple
concepts in mind...

But perhaps causality in AI (i.e. artificial symbolic reasoning in the
broadest sense) is a different matter.
There, we do not even have an existence proof that there is a global
"right answer."
> 3. The definition is operational in the sense that given two events A
> and B, the definition can be employed to answer the questions: a) Did or
> does or will A cause B or vice-versa? b) If there is a causal connection
> between A and B, what is its strength?

This reminds me that there are inherent limits in what can be done with
words in natural language
anyway. Or perhaps... natural language gives us a choice between
propositions and definitions which... vary greatly in respect to how
meaningful they are and how precise they are.

With regard to "a" -- even Aristotle pointed out that there are some
intrinsic problems in trying to arrive at ONE
crisp yes-or-no answer to such questions. The problem is not just a problem
of DEGREE of "yes" or "no."
The problem is that the concept is multidimensional... maybe you could call
it a multimodal membership function.
In verbal, Aristotelian terms, there are forms of causality like "efficient
cause" and "proximate cause" and so on...
I forget the details.

In quantitative terms... I am reminded of reasoning about causes in
sociology. Typically, there are multiple causes for everything,
and the more meaningful questions revolve about... what is your model of
how the system works... and how do you reconstruct
the key variables and the model as it applies to the specific history under
study. More powerful and clear thinking
entails a steady, careful migration of thought from vague ideas about "A
causes B" to more comprehensive ideas
based on more global models of the system under study... (Where precise
thinkers get into trouble is
when they try to force a global model prematurely, and thereby throw out
the empirical data and the
preliminary fuzzy integration of that data. But in science, preliminary
global models can be a useful part
of the overall enterprise...)

Given a model (stochastic or fuzzy or whatever).... the connection between
A at time t and B at time t+1 (the usual
form of causality in the mundane macroscopic world) ... can be analyzed in
many ways. For example...
Freud discussed learning the strength of the relation from A to B, as a
core part of his model
of psychodynamics. (See Pribram's book The Freud Project Reassessed, or the
criticism of psychiatry/psychology
in the book of Barrett and Yankelovich.) The model-based reinforcement
learning model I developed was
really just a translation of Freud's theory into mathematics. In essence,
the strength of the connection from A to B
was operationalized as the derivative of B(t+1) with respect to A(t), in
the cause-and-effect network.
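A small sketch of that operationalization: the strength of the A-to-B
connection as the derivative of predicted B(t+1) with respect to A(t).
The model function here is a hypothetical stand-in; in a real system it
would be a trained network, and the derivative would come from
backpropagation rather than finite differences.

```python
import math

def model(a, other_inputs):
    """Hypothetical learned model predicting B(t+1) from A(t) and the rest."""
    return math.tanh(2.0 * a + 0.5 * sum(other_inputs))

def strength_a_to_b(a, other_inputs, eps=1e-6):
    """Finite-difference estimate of dB(t+1)/dA(t) at an operating point."""
    return (model(a + eps, other_inputs) - model(a - eps, other_inputs)) / (2 * eps)

s = strength_a_to_b(0.1, [0.3, -0.2])
# Analytically: 2 * (1 - tanh(0.25)**2), about 1.88
```

Note that the strength depends on the operating point -- the same
connection can be strong in one regime of the system and weak in another,
which is part of why a single global "A causes B" number is misleading.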

You may ask: "the derivative... or the expectation value of the
derivative?" In fact, in a complex reinforcement learning system,
the proper management of uncertainty is somewhat tricky. Expectation values
are sometimes calculated at the level of
predicting X(t+1) as a function of X(t), but not always. (See the overview
in chapter 13 of the Handbook of Intelligent Control,
David White and Sofge, eds, or the more technical discussion at [...].)
In this sense... a real understanding requires that we use DIFFERENT
metrics of A-causes-B in different parts of the overall system.
> 4. The definition is consonant with our intuitive perception of
> causality and does not lead to counterintuitive conclusions.
> There are many well-known sources of difficulty in defining or
> establishing causality. Among them there are three that stand out in
> importance.
> 1 Chaining. In this case, we have a temporal chain of events, A1,
> A2,..., An-1, An, which terminates on An. To what degree, if any, does
> Ai (i = 1,...,n-1) cause An? Example: I am called by a friend. He needs
> my help and asks me to rush to his home. I jump into my car and drive as
> fast as I can. At an intersection, I am hit by another car. I am killed.
> Who caused my death? My friend; I; or the driver of the car that hit me?

This is very similar to the examples from Aristotle and from sociology.

Perhaps we need to be able to talk about all three kinds of causality...
as per Aristotle... instead of trying to arrive at a precise and unique
answer to "what is THE unique cause"... we need to migrate to a different
way of thinking whenever we can.
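For what it is worth, the chaining example can at least be scored
numerically under illustrative conventions: a fuzzy-style min rule, or a
probability-style product rule over the link strengths. The strengths
below are invented, and neither rule is offered as THE answer to the
question.

```python
def chain_min(strengths):
    """Fuzzy-style score: a chain is only as causal as its weakest link."""
    return min(strengths)

def chain_product(strengths):
    """Probability-style score: multiply the link strengths along the chain."""
    result = 1.0
    for s in strengths:
        result *= s
    return result

# Friend's call -> my rushing -> the crash -> my death (illustrative strengths):
links = [0.9, 0.7, 0.95]
print(chain_min(links))      # 0.7
print(chain_product(links))  # about 0.5985
```

Both rules make early links in a long chain matter less -- which matches
the intuition that the friend is less culpable than the other driver, but
leaves entirely open which convention, if either, is the "right" one.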

But: actually, if you are hit by a car, perhaps the lawyers would say this
is not
just an academic exercise, and they need to identify who goes to jail and
who doesn't.
That's another type of decision. I am glad I am not responsible for that
kind of decision.

Well... maybe I am, in a way. Instead of deciding who goes to jail, I am
involved in deciding
who gets funding. And there, we have a way to operationalize the criteria.
We always ask
about derivatives, in effect. We ask "if you fund X, versus if you do not,
what is the expected
net impact -- benefit -- on the sum of the two value criteria we discuss
(benefit to science and
benefit to humanity in general) in some detail...." There is some inherent
fuzziness in answering
the right questions here, but that is a lot better than places which base
their funding on precise answers to
irrelevant questions.

In any case: the problems you cite are INHERENT problems in the
simplified form of reasoning which uses the everyday version of
causality, where "one thing (A) causes B."
I do not believe that probability theory OR fuzzy logic or anything else
can eliminate the inherent limitations of this simplified/approximate
form of reasoning.
But certainly, you could argue that UNDERSTANDING the everyday
A-causes-B way of thinking... requires that we understand and accept the
fact that everyday reasoning involves a lot of fuzzy thinking. And
sometimes even fuzzy math.


By the way, this leads to an interesting question: if subsymbolic thought, as in the "Model" network of a mouse brain, is much more elegant and mathematically defensible than the everyday reasoning of human beings, why haven't the mice taken over all our houses yet?

In essence, the problem is that it is easier to be integrated and
coherent when dealing with smaller intervals in time and space. We retain
the capabilities of the mouse... and add some other capabilities... which
are useful in managing more extended information. We do not manage that
extended information as well as we can manage the "old stuff" which a
mouse can handle well... and even a mouse is a big step beyond that
strawman model I proposed in 1987... but imperfect management of that
information is better than ignoring it altogether. As we evolve... there
is hope that we can manage that extended information better than we
normally do today.


This archive was generated by hypermail 2b25 : Fri Dec 08 2000 - 09:17:31 MET