Re: Fuzzy vs. Probability: Just give it to me in plain English

S. F. Thomas
Mon, 13 May 1996 13:18:41 +0200

Franz Newland wrote:
(( cuts ))
: Whilst it has been said that fuzziness can describe
: ambiguity, it may be important to point out that the
: ambiguity is not in the data being described, but rather the
: label being attached to the data: If someone is 1.7m tall,
: the ambiguity arises from applying the label 'tall' or
: 'average' to that person. The datum itself has no such
: ambiguity.

True. But note that "1.7m" is itself a label, no different
in principle from "tall", the difference being only a matter
of precision... The true datum is conceptually a *point*
in a conceptual continuum of height values, which to be
described unambiguously would require, again conceptually,
an infinite number of decimal places. Practically, of
course, the label-set we use must be discrete, which when
applied to describing points in a continuum, inevitably
allows ambiguity, whether the labels be numeric or non-numeric.

: The fuzzy membership functions for the labels
: 'tall' and 'average' allow the label ambiguity to be
: resolved.

The label ambiguity is never "resolved"; it may at best
be *characterized*. Perhaps that is what you meant.

Actually, the term "characteristic function" is a better
one than "membership function" in some ways. The latter
presupposes a notion of set, when what we are really
after, in the beginning, is the characterization of
*labels*. The label "tall" is tied to a universe of
discourse, that of the abstract attribute of height, and
may be explicated in terms of a characteristic function
over that abstract attribute space. The fuzzy set of
tall *men*, say, to which the fuzzy label gives rise, must
in addition invoke a population of some sort -- men,
in this case -- for which the term "set" is more naturally
appropriate. Thus, the term "membership function" may
confusingly be applied either to the fuzzy subset of the
underlying population (men, say, or buildings), or to some
fuzzy region of the underlying attribute space (height).
The term "characteristic function" (of a label) cannot
similarly be ambiguous. Be that as it may, the term
"membership function" is now entrenched, happily doing
conceptual double duty whenever required, referring at
times to attribute-space, at other times to object-
or population-space.

: In contrast, probability addresses the issue of
: 'uncertainty'.

Uncertainty of occurrence, yes. Label ambiguity is also
a form of uncertainty -- that of description/measurement.

: If we have three measurements of someone's
: height, as 1.5m, 1.7m and 1.62m say, we could generate a
: probable height from this data,

... More precisely, a probability distribution as to whatever
measurement process yields such data...

: and a probability of one of
: the above linguistic labels being used in description of
: this data, despite the ambiguity of the actual data.

Not sure what you mean. A separate calibration experiment
could certainly be performed to determine label usage,
for exemplars of various height values.
If the label set is the non-numeric "tall", "short", etc.,
one may use a meter-stick to calibrate such terms. Clearly,
one needs a more precise instrument to calibrate measurement
reports of one that is less precise. Even measurement
reports such as "1.5m" have labelling uncertainty surrounding
them, either crisply, as inherent in adopting the usual
convention of approximating the uncertainty around such
a measurement report with the crisp interval [1.45,1.55]...
or fuzzily, as will quickly become
apparent if one attempts to map a characteristic function
for measurement reports of a device such as a meter-stick.
Thus the label "1.5m" could well refer to points outside
the interval [1.45,1.55], depending on accidental or systematic
errors in actual use, and there would be no sharp inclusion/
exclusion boundary, as in the crisp interval approximation.
In any event, *each* datum carries
with it uncertainty of measurement/description, even as
a *collection* of such measurements -- implicitly over some
sort of population, somehow defined -- will constitute
a statistical process subject to uncertainty of occurrence
from trial to trial.
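
As a rough sketch of the contrast just drawn, the two readings of
the report "1.5m" can be written as functions over height. The
bell-shaped curve and its spread of 0.05 are assumptions made only
for illustration; a real curve would have to come from calibrating
the instrument, and the function names are hypothetical.

```python
import math

def crisp_report(x, reported=1.5, half_width=0.05):
    """Crisp reading: 1 inside the interval around the report, else 0."""
    return 1.0 if abs(x - reported) <= half_width else 0.0

def fuzzy_report(x, reported=1.5, spread=0.05):
    """Fuzzy reading: an assumed bell-shaped curve peaking at the report."""
    return math.exp(-0.5 * ((x - reported) / spread) ** 2)

for height in (1.50, 1.52, 1.58):
    print(height, crisp_report(height), round(fuzzy_report(height), 3))
```

At 1.58 the crisp reading excludes the point outright, while the
fuzzy reading still gives it a small positive grade, which is the
"no sharp inclusion/exclusion boundary" behaviour described above.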

: In the
: context above, the question answered by probability theory
: would most likely be 'what is the probability of a given
: group assigning the vague linguistic label 'tall' to the
: datum 1.7m'. This presupposes the datum, and finds a degree
: of application of the label to the datum.

Yes, giving a workable operationalization of the notion
of "grade of membership".
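
One way this operationalization might look in practice, as a
minimal sketch: take yes/no judgements from a group and use the
fraction answering "yes" at each height as the grade of membership.
The survey data and the function name below are invented purely for
illustration.

```python
from collections import defaultdict

def estimate_membership(responses):
    """Map each height to the fraction of observers who applied the label."""
    counts = defaultdict(lambda: [0, 0])  # height -> [yes votes, total votes]
    for height, said_yes in responses:
        counts[height][0] += int(said_yes)
        counts[height][1] += 1
    return {h: yes / total for h, (yes, total) in sorted(counts.items())}

# Hypothetical survey: (height in metres, did the observer say "tall"?)
survey = [
    (1.6, False), (1.6, False),
    (1.7, True), (1.7, False), (1.7, False), (1.7, True),
    (1.8, True), (1.8, True),
]
tall = estimate_membership(survey)
print(tall)
```

With these made-up answers, half the group calls 1.7m "tall", so
the estimated grade of membership at that height is 0.5.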

: Fuzzy memberships of sets 'tall' and 'average' presuppose
: that the datum has elements of those labels in it, and,
: given that information, resolve the vagueness of those
: labels in the context of the datum.

Again, the vagueness is not resolved, merely characterized.
To resolve the vagueness, for attribute values drawn from
a continuum, one would have to use an infinite number of
decimal places. For discrete attributes, for example
that of cardinality, the labelling scheme is exact. It is
precisely the exactness of the numbering scheme "one", "two",
"three", etc. for the whole numbers, or the attribute of
cardinality, which forms the basis for bootstrapping a
conceptual basis for elaborating an exact labelling scheme
for points in a linear continuum -- the so-called "real"
numbers. But this exactness is conceptual, not practical,
because to distinguish any point in a continuum from
every other point, an infinite number of "decimal places"
is required.

(( cuts ))

: Consider the case of an image pixel where the pixel contains
: some emission from a region of land, and some emission from
: a region of sea. The pixel is clearly some average of these
: effects. A probabilistic approach to interpreting that pixel
: would attempt to find a likely label to assign to the pixel,
: thus making no presupposition about the label, only the
: data. The fuzzy set approach would be concerned with the
: situation where we knew that the pixel contained land and
: sea information, and wanted to have a degree of membership
: of each.

Hmmm. This suggests that there is some competition between
the two "approaches", which there isn't, I don't think.
Pixels are objects. They have attributes, e.g. color,
brightness. I would imagine there is some way to measure
color, and brightness, given the ease with which both are
manipulated on computer screens. On the basis that every color may
be represented as a vector having dimensions red, green,
blue (I seem to remember these as being the primary colors),
each of which may be measured on a brightness scale
from 0 to 1, say, pixels could be measured/described
according to possession of a three-dimensional vector
attribute. This 3-dimensional space could be subject
to a fuzzy labelling scheme, e.g. reddish, greenish, etc., also
dim, bright, etc. (Don't know about land and sea, though,
which would seem to me to be less a property of a *single*
pixel than the *arrangement* of pixels in a neighborhood
of an image. On the other hand, I'm no expert in such things,
so I may be wrong.) Different points in the 3-space would
be describable by these fuzzy labels to differing degrees.
That's the measurement/description angle, which is the fuzzy
domain, as discussed before.
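
A minimal sketch of such a labelling scheme over the 3-space, with
each channel on a 0-to-1 scale. The particular definitions of
"reddish" and "bright" are assumptions chosen for simplicity, not
calibrated against anyone's actual usage.

```python
def reddish(rgb):
    """Assumed grade: how far red exceeds the stronger of green and blue."""
    r, g, b = rgb
    return max(0.0, r - max(g, b))

def bright(rgb):
    """Assumed grade: the mean of the three channels."""
    return sum(rgb) / 3.0

# Hypothetical pixels as (red, green, blue) triples.
for pixel in [(1.0, 0.0, 0.0), (0.9, 0.3, 0.2), (0.2, 0.8, 0.1)]:
    print(pixel, round(reddish(pixel), 2), round(bright(pixel), 2))
```

Each point in the space gets a grade for every label, so one pixel
can be fairly reddish and moderately bright at the same time.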

If now you consider a population of observers who use
language to describe pixels, and there is some imperfect
convention by which terms like greenish and bluish are
applied, then we have a statistical process corresponding
to the usage of these terms as applied to, say, an arbitrary
pixel whose attributes (color mix, brightness) could be
manipulated (as in a cathode ray tube). Thus fuzzy usage
could be characterized by means of a statistical process.

I see a duality between fuzziness and probability, with
fuzziness related to probability in the same way that
likelihood is related to, but distinct from, probability.
Implicit in any parametric probability model is a
likelihood function. The one is a function which has
as its domain the sample space, the parameter(s) being
taken as given; the other is a function which has as its
domain the parameter space, some occurrence in the sample
space being taken as given. In the same way, occurrence
uncertainty in the use of labels (yes/no sample space,
for a given label at a given point of attribute space)
gives rise to a semantic likelihood function over the
attribute space -- of hypothetical values for the attribute sought
to be described by the label. (Thus attribute space becomes
effectively parameter space in the context of the yes/no
usage model.) The semantic likelihood function is thus a
membership function, and vice versa. Fuzzy is therefore
nothing but likelihood revisited from a different angle,
with the same basic relation to probability as the notion
of likelihood, though with fresh semantics added.
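
The duality can be sketched with a yes/no usage model for the label
"tall". The logistic curve, its midpoint of 1.75m and its steepness
are assumptions for illustration only; what matters is that one
expression is read two ways.

```python
import math

def p_yes(x, midpoint=1.75, steepness=20.0):
    """Hypothetical probability that a speaker says "tall" at height x."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

def probability(y, x):
    """Probability view: x is given; the domain is the sample space {0, 1}."""
    return p_yes(x) if y == 1 else 1.0 - p_yes(x)

def likelihood(x, y):
    """Likelihood view: the observed y is given; the domain is height x."""
    return probability(y, x)

def membership_tall(x):
    """Semantic likelihood of height x, given that "tall" was used (y = 1)."""
    return likelihood(x, 1)

print(probability(0, 1.7) + probability(1, 1.7))
print([round(membership_tall(x), 3) for x in (1.6, 1.75, 1.9)])
```

For a fixed height the two probabilities sum to one over the sample
space, whereas the membership values over heights are under no such
constraint, just as with any likelihood function.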

As to the land/sea question, I suspect that that determination
must be made, not at the level of the individual pixel,
but at the level of the whole scene, or perhaps a patch of
scenery, because one needs not only color and brightness
to distinguish a land pixel from a sea pixel in an image,
but also the nature of the contrast with surrounding pixels.
In any case, I think the model/hypothesis dichotomy,
which is at the base of the probability/likelihood
and probability/fuzzy duality, still stands. Try to be
precise about your fundamental model, and the tools
needed to address the questions/hypotheses raised by
your model will suggest themselves.

Hope the foregoing is helpful.

: Regards,
: -----------------------------------------------------------
: Mr. F. Newland,
: ------------------------------------------------------------

S. F. Thomas