Re: Feature extraction from text


Subject: Re: Feature extraction from text
From: zirzop@my-deja.com
Date: Sat Jan 15 2000 - 05:19:17 MET


> > I am trying to build a neural-network-based email classifier. What
> > features should be extracted from text documents to generate numerical
> > inputs to the neural network?
> >
> > Could you give me pointers to any tutorials, papers, etc.?
>
> The approaches that come to mind are all non-NN-based. References are:

Is there any specific reason for using a Bayesian classifier in this type
of application?

Aren't NN methods good enough?

> W.W. Cohen, Learning Rules that Classify Email, AAAI Spring Symposium 1996.
> M. Sahami, S. Dumais, D. Heckerman and E. Horvitz, A Bayesian Approach
> to Filtering Junk E-Mail, AAAI-98 Workshop on Learning for Text
> Categorization (AAAI Technical Report WS-98-05).
>
> The second paper is particularly interesting, because it illustrates how
> domain-specific features (e.g., what's the sender's domain?) can
> improve performance in detecting junk mail.
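
To make the question concrete, here is a minimal sketch of how a message might be turned into a fixed-length numerical vector that combines word-presence features with one domain-specific feature (the sender's domain). The vocabulary, the trusted-domain list, and the function names are all hypothetical, chosen only for illustration; they are not taken from either paper cited above.

```python
import re

# Hypothetical vocabulary and domain list, chosen only for illustration.
VOCABULARY = ["free", "money", "meeting", "report"]
TRUSTED_DOMAINS = {"example.edu"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_features(sender, body):
    """Return a fixed-length list of floats usable as classifier inputs."""
    words = set(tokenize(body))
    # Binary bag-of-words: 1.0 if the vocabulary word occurs in the body.
    features = [1.0 if word in words else 0.0 for word in VOCABULARY]
    # Domain-specific feature: is the sender's domain on the trusted list?
    domain = sender.rsplit("@", 1)[-1].lower()
    features.append(1.0 if domain in TRUSTED_DOMAINS else 0.0)
    return features


if __name__ == "__main__":
    print(extract_features("alice@example.edu", "Free money!!! Claim your FREE prize"))
    print(extract_features("bob@spam.example", "Meeting report attached"))
```

The same vector could in principle be fed either to a neural network or to a Bayesian classifier; the feature extraction step does not depend on which one is used.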

Thanks a lot for the references!

ZZ



