**Previous message:** Earl Cox: "Re: Is there anything wrong with fuzzy inference?"
**Maybe in reply to:** Joe Pfeiffer: "Thomas' Fuzziness and Probability"
**Next in thread:** Stephan Lehmke: "Re: Thomas' Fuzziness and Probability"
**Messages sorted by:** [ date ] [ thread ] [ subject ] [ author ]

hrubin@odds.stat.purdue.edu (Herman Rubin) wrote in message news:<9l4kv0$2evu@odds.stat.purdue.edu>...

> In article <66b61316.0108091708.7d6b9958@posting.google.com>,
> S. F. Thomas <sfrthomas@yahoo.com> wrote:
> >robert@localhost.localdomain (Robert Dodier) wrote in message
> >news:<9kt895$rs$1@localhost.localdomain>...
> >> In the interest of brevity, I've indulged in wanton snippage,
> >> but I hope what's left yields something comprehensible.
>
> >> S. F. Thomas <sfrthomas@yahoo.com> wrote:
>
> >> > Robert Dodier wrote:
>
> ..............
>
> >Goodness, no. What I do argue however is that the semantics of
> >likelihood do not just fall neatly out from the semantics of
> >probability. Probability provides some of the underpinning, but not
> >all. Otherwise Fisher would not have been led up a blind alley by
> >asserting that the "likelihood of a or b is like the income of Peter
> >or Paul, we don't know what it is until we know which is meant."
>
> I am by no means convinced that Fisher understood this, but
> I can see no way that the likelihood of "a or b" makes any
> sense at all.

That has precisely been the problem for all the generations of
statisticians since Fisher. I presume you refer to the original
probability model from which likelihood derives, f(x;w), where x ranges
over the sample space, w ranges over the parameter space, and f is the
density function for the random variable in question. For any point
hypothesis w=a, f is clearly defined. But for a composite hypothesis
{a,b}, it is not clear how f is defined. Therefore -- and this was the
precise thrust of Fisher's metaphor -- we don't know what the
likelihood of "a OR b" is until, like the income of Peter or Paul, we
know which is meant.

I presume that it is thinking along these or similar lines that leads
you to say that the likelihood of "a or b" makes no sense at all. Or to
say that the likelihood of "a OR b" is the likelihood corresponding to
the stronger element, which is what leads to a maximum rule for
likelihood disjunction. Or, one uses the probability metaphor as in the
Bayesian set-up, rescales the likelihood function to sum to unity,
whereupon the likelihood of "a OR b" becomes the sum of the two
(rescaled) likelihoods, with appropriate modification if the likelihood
is construed as a density function and the integral calculus is
applied. Like it or not, that is essentially what Bayes does, although
the story and the argumentation to get there are very different,
requiring ritualistic obeisance to priors of one form or another, in
particular "uninformative" ones if need be. Be all that as it may, if
you have an inferential method that purports to give a direct
characterization of uncertainty in model parameters, then you are
perforce computing likelihoods of sets or of composite hypotheses,
i.e. you have a method for computing something like L(a OR b).
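The two candidate rules just mentioned -- the maximum rule and the rescaled-sum (Bayesian) rule -- can be sketched numerically. This is my own illustration, not anything from the thread, using a binomial likelihood on a discretized parameter space:

```python
# Sketch (my own, for illustration): two candidate rules for the likelihood
# of the composite hypothesis {a, b} under a binomial model, 7 successes in
# 10 trials, with the parameter space discretized to a grid.
from math import comb

def binom_lik(p, n=10, k=7):
    """Absolute binomial likelihood of parameter p given k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

grid = [i / 100 for i in range(1, 100)]       # p = 0.01, ..., 0.99
L = {p: binom_lik(p) for p in grid}

a, b = 0.5, 0.7

# Rule 1: maximum rule -- L({a,b}) is the likelihood of the stronger element.
L_max = max(L[a], L[b])

# Rule 2: "Bayesian" rule -- rescale L over the whole grid to sum to unity
# (in effect a posterior under a flat prior), then add the two values.
Z = sum(L.values())
L_sum = (L[a] + L[b]) / Z
```

Note that the two rules need not even agree on ordering across different composite hypotheses, which is one way of seeing that they encode genuinely different disjunction semantics.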

> This
> >leads to a likelihood calculus in which set evaluation is of the form
>
> > L( {a,b} ) = L(a OR b) = Max( L(a), L(b) )
>
> Are you taking a view of a linear truth value system?

Maybe... I don't know what you mean by "linear truth value system".

> AFAIK, this was first proposed by Lukasiewicz, and does
> not work at all well.
>
> Likelihood is NOT probability, and "a OR b" does not
> mean anything from the standpoint of likelihood.

But see above.

> >which rather quickly proves to be inadequate. Had it not been
> >inadequate, I don't think classical statistics would have gone to all
> >the trouble it has to develop indirect methods of describing the
> >uncertainty in model parameters consequent upon sampling. Nor would
> >there have been a neo-Bayesian revival intended to supplant the
> >classicists precisely by offering a method of *direct*
> >characterization. Indeed, Bayes offers a likelihood calculus in which
>
> > L(a OR b) ~ (L(a) + L(b))
>
> Bayes never offered anything about a likelihood calculus.

Nor did Savage, de Finetti and the others. My point was different: it
is that that, *in effect*, is what Bayesian inference is doing. The
whole song and dance about the prior just confuses this core issue, to
which it is easy to return simply by imagining a completely
"uninformative" prior (if such a thing is not a contradiction in
probabilistic terms), and seeing the posterior for what it then is,
i.e. the likelihood function appropriately rescaled, and now
interpreted as probability or probability density.
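That equivalence is easy to check numerically. A minimal sketch of my own, assuming a binomial model and a flat prior on a discretized parameter space:

```python
# My own sketch: under a flat ("uninformative") prior on a discrete grid,
# the Bayesian posterior is exactly the likelihood function rescaled to
# sum to one.
from math import comb

n, k = 10, 7
grid = [i / 100 for i in range(1, 100)]
lik = [comb(n, k) * p**k * (1 - p)**(n - k) for p in grid]

prior = [1 / len(grid)] * len(grid)            # flat prior
unnorm = [pr * L for pr, L in zip(prior, lik)] # prior x likelihood
posterior = [u / sum(unnorm) for u in unnorm]

rescaled_lik = [L / sum(lik) for L in lik]     # likelihood, renormalized
assert all(abs(po - rl) < 1e-12 for po, rl in zip(posterior, rescaled_lik))
```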

> To Bayes, Fisher, Neyman, Laplace, Gauss, Kolmogorov, and
> others, one can take the or of statements or the union of
> events, but this is for probability. Likelihood is not
> probability, although it is an equivalence class of formal
> entities derived from probability.

I am most certainly under no confusion on that score.

> >where ~ is to indicate that some normalization, appropriate to the
> >construction of likelihood as a metaphorical (belief) probability, is
> >necessary. It is only with the fuzzy set theory that semantics
> >suggests itself
>
> > L(a OR b) = L(a explains the data OR b explains the data)
>
> >where "explains the data" is a fuzzy predicate no different in
> >principle from "is tall", and subject to calibration in conceptually
> >the same way. This leads, albeit with some reworking of the Zadehian
> >fuzzy set theory along the way, to
>
> "Explains the data" is philosophical gobbledygook. Assuming
> that we can assume that we have a binomial model, and we get
> a positive number of successes and failures, ALL binomial
> distributions with 0 < p < 1 "explain" the data; there is
> a positive probability that the data could have come from
> such a model.

But some explain the data better than others. As Fisher said, the
likelihood function supplies a "natural order of preference for the
possibilities under consideration". It is exactly analogous to the
notion of semantic likelihood (or membership function) for a term such
as "tall" providing a natural order of preference for what a competent
speaker of the language *could* mean when she uses the term tall to
characterize one's height. Therefore, analogously to some speaker
(witness) saying "the unknown attacker is tall", the result of
sampling from a probability distribution is the implicit assertion by
"the data" to the effect that "the unknown probability model parameter
is [an explanation of the observed sample]", and the membership
function of the term in brackets may be identified with the (absolute)
likelihood function generated by the data under the model. Call that
philosophical gobbledygook if you like. All philosophical abstraction
is in the end metaphor. Some such abstractions never make it down to
ground, I quite agree. But that is not the case here. What I propose is
quite computable. And the essential insight seems to me quite plain,
though I would readily admit that the semantics are unfamiliar.
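The analogy is indeed computable in a few lines. A sketch of my own (the function name `explains_the_data` is of course hypothetical): the relative likelihood serves as a membership function grading how well each parameter value explains the observed sample, just as a membership function for "tall" grades heights:

```python
# My own sketch: the relative likelihood of a binomial parameter, read as a
# membership function for the fuzzy predicate "explains the data".
from math import comb

n, k = 10, 7

def lik(p):
    """Absolute binomial likelihood of p given k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

grid = [i / 100 for i in range(1, 100)]
Lmax = max(lik(p) for p in grid)

def explains_the_data(p):
    """Membership grade in [0, 1]: relative likelihood lik(p) / sup lik."""
    return lik(p) / Lmax

# Every p in (0, 1) "explains" the data to some positive degree, as Rubin
# says -- but the natural order of preference peaks at the maximum-likelihood
# value k/n = 0.7, and grades other values down from there.
```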

> > L(a OR b) = L(a) + L(b) - L(a)*L(b)
>
> >where indeed the laws of probability are invoked, and at that in a
> >very simple way, but it is the fuzzy set semantics, and the device of
> >the calibrational proposition, that provides the essential frame that
> >Fisher overlooked.
>
> The likelihood function can be multiplied by any constant,
> and often is; L and c*L are the "same" likelihood function
> for any statistical purpose.

Not if you are using the product-sum rule of disjunction. For that
purpose, one must distinguish the absolute likelihood function from
the *relative* likelihood, which I quite agree is unique only up to
similarity transformations, and to which you allude. Thus, for example,
if you are computing a marginal likelihood function, you would work
with the absolute likelihoods to accomplish the marginalization, and
only then may you rescale. In the theory I am concerned to develop I
in fact use the term membership (or characteristic) function for the
absolute likelihood, since I am essentially drawing on the insights
and semantics of the fuzzy set theory (reworked to admit the notion of
calibrational proposition with which this thread was begun), and the
term possibility distribution for the relative likelihood.
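The scale-dependence at issue is easy to verify numerically. A sketch of my own: the product-sum rule does not commute with rescaling, whereas the maximum rule does, which is why the absolute/relative distinction matters for the former but not the latter:

```python
# My own sketch: L(a) + L(b) - L(a)*L(b) is not invariant under the rescaling
# L -> c*L, so the product-sum rule needs the absolute likelihood; the
# maximum rule commutes with any rescaling by c > 0.
def psum(x, y):
    """Product-sum disjunction: x + y - x*y."""
    return x + y - x * y

La, Lb, c = 0.6, 0.3, 0.5

lhs = c * psum(La, Lb)          # rescale after combining
rhs = psum(c * La, c * Lb)      # combine the rescaled values
assert abs(lhs - rhs) > 1e-9    # scale-dependent: the two disagree

# The maximum rule, by contrast, is indifferent to the choice of scale:
assert c * max(La, Lb) == max(c * La, c * Lb)
```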

> Why should anything be
> independent, even if it can be considered probabilities?
>
> .................

I am not sure I get your point here in this context. But if it is what
I think it is, then the reworked fuzzy set theory continues to have
the min-max connectives in certain circumstances, in particular when
there are constraints of strong positive semantic consistency linking
the respective affirmation probabilities; in such cases there is
clearly no independence. Likewise, where there is strong negative
semantic consistency (for example, those affirming an exemplar to be
tall tending systematically to disaffirm him to be short), the
appropriate rules for the conjunction and disjunction connectives are
the bounded-sum (Lukasiewicz) rules. It is only when semantic
independence may be assumed that the product and product-sum rules are
appropriate. That would appear to be the appropriate assumption in the
case of statistical inference, which involves in a sense the
interpretation of what the "data" say.
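The three regimes just distinguished can be summarized as connective pairs. A sketch of my own (the function names are mine; each pair is a standard t-norm/t-conorm dual):

```python
# My own summary of the three connective regimes above, for membership
# grades in [0, 1].

def and_min(x, y):  return min(x, y)            # strong positive consistency
def or_max(x, y):   return max(x, y)

def and_luk(x, y):  return max(0.0, x + y - 1)  # strong negative consistency
def or_luk(x, y):   return min(1.0, x + y)      # (bounded-sum, Lukasiewicz)

def and_prod(x, y): return x * y                # semantic independence
def or_psum(x, y):  return x + y - x * y        # (product / product-sum)

x, y = 0.8, 0.6
for conj, disj in [(and_min, or_max), (and_luk, or_luk), (and_prod, or_psum)]:
    # Each conjunction/disjunction pair is De Morgan-dual under 1 - x negation.
    assert abs(disj(x, y) - (1 - conj(1 - x, 1 - y))) < 1e-12
```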

Regards,

S. F. Thomas

############################################################################

This message was posted through the fuzzy mailing list.

(1) To subscribe to this mailing list, send a message body of

"SUB FUZZY-MAIL myFirstName mySurname" to listproc@dbai.tuwien.ac.at

(2) To unsubscribe from this mailing list, send a message body of

"UNSUB FUZZY-MAIL" or "UNSUB FUZZY-MAIL yoursubscription@email.address.com"

to listproc@dbai.tuwien.ac.at

(3) To reach the human who maintains the list, send mail to

fuzzy-owner@dbai.tuwien.ac.at

(4) WWW access and other information on Fuzzy Sets and Logic see

http://www.dbai.tuwien.ac.at/ftp/mlowner/fuzzy-mail.info

(5) WWW archive: http://www.dbai.tuwien.ac.at/marchives/fuzzy-mail/index.html


*This archive was generated by hypermail 2b30: Mon Aug 13 2001 - 13:21:22 MET DST*