# BISC: A challenge to data miners: the soccer problem

Subject: BISC: A challenge to data miners: the soccer problem
From: Michelle T. Lin (michlin@eecs.berkeley.edu)
Date: Tue Jul 18 2000 - 19:57:42 MET DST

*********************************************************************
Berkeley Initiative in Soft Computing (BISC)
*********************************************************************

To: BISC Group

A Challenge to Data Miners: The Soccer Problem
----------------------------------------------

Recently, I was watching the soccer match between France and
Portugal. I noticed that most of the time the ball was in the vicinity
of the goal of Portugal. This observation suggested the following
hypothesis, call it H.

For generality, let the opposing sides be labeled A and B. Let
r be the fraction of time the ball spends in the vicinity of the goal
of A. The hypothesis is that the closer the value of r is to 1, the
higher the probability that B will win.

To make the hypothesis more concrete, assume that r is
measured as follows. Let the playing field be partitioned into zones
R1,...,Rn, with R1 being nearest to the goal of A and Rn the farthest.
Let ri be the fraction of time the ball spends in Ri, i=1,...,n. Let
wi, i=1,...,n, be weights ranging in magnitude from 0 to 1. Then
r=w1r1+...+wnrn.
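
As a minimal sketch of this measurement, the Python below computes r
from the time the ball spends in each zone. The function names, the
three-zone split, the durations and the weights are all illustrative
assumptions, not part of the problem statement.

```python
def zone_fractions(zone_seconds):
    """Convert time spent in each zone R1..Rn into fractions r1..rn."""
    total = sum(zone_seconds)
    if total <= 0:
        raise ValueError("total time must be positive")
    return [s / total for s in zone_seconds]


def weighted_r(fractions, weights):
    """Compute r = w1*r1 + ... + wn*rn."""
    if len(fractions) != len(weights):
        raise ValueError("need one weight per zone")
    if any(abs(w) > 1 for w in weights):
        raise ValueError("weights must have magnitude between 0 and 1")
    return sum(w * f for w, f in zip(weights, fractions))


# Hypothetical numbers: three zones, 90 minutes of play given in seconds.
fractions = zone_fractions([2700, 1800, 900])  # time in R1, R2, R3
# Assumed weights that fall off with distance from the goal of A.
r = weighted_r(fractions, [1.0, 0.5, 0.0])
```

With these assumed weights, time spent in R1 counts fully, time in R2
counts half, and time in R3 does not count at all.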

Does there exist a choice of the Ri and the wi such that H is
true? This is the crux of the problem. The assumption is that we
analyze N games, with r computed at the end of each game. The result
for game j, j=1,...,N, will be W(j) (win), D(j) (draw) or L(j) (loss),
with r being r(j), j=1,...,N. These data, then, would serve as a basis
for testing H.
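
One simple way to look at such data, sketched below, is to split the
games at a threshold on r and compare how often B wins on each side.
This is only one possible check, not a full statistical test. The
sketch assumes that outcomes are recorded from the point of view of B,
and the game list is made up for illustration.

```python
def win_rate_by_r(games, threshold=0.5):
    """Return B's win rate for games with r < threshold and r >= threshold.

    games: list of (r, outcome) pairs, outcome in {"W", "D", "L"},
    recorded from B's point of view (an assumption of this sketch).
    """
    low = [outcome for r, outcome in games if r < threshold]
    high = [outcome for r, outcome in games if r >= threshold]

    def rate(outcomes):
        # None signals that no game fell on this side of the threshold.
        return outcomes.count("W") / len(outcomes) if outcomes else None

    return rate(low), rate(high)


# Made-up sample, not real match data.
games = [(0.2, "L"), (0.3, "D"), (0.4, "L"),
         (0.6, "W"), (0.7, "W"), (0.8, "D")]
low_rate, high_rate = win_rate_by_r(games)
```

If H holds for a given choice of zones and weights, the second rate
should tend to exceed the first as N grows.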

The soccer problem is an instance of a problem in data mining
in which a hypothesis, H, is (a) generated; (b) tested; and (c)
modified. In my view, it is a challenging problem because choosing
and adjusting the Ri and wi is not a simple matter.
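
To illustrate the modify step, the sketch below keeps the zones fixed
and tries every weight vector on a coarse grid, scoring each by how
much higher r is, on average, in the games B won than in the others.
The scoring rule, the grid and the sample games are all assumptions
chosen for illustration. Weights fitted this way would still need to
be checked on games not used in the search.

```python
import itertools


def weighted_r(fractions, weights):
    """Compute r = w1*r1 + ... + wn*rn for one game."""
    return sum(w * f for w, f in zip(weights, fractions))


def score(weights, games):
    """Mean r over games B won minus mean r over all other games."""
    won = [weighted_r(f, weights) for f, outcome in games if outcome == "W"]
    rest = [weighted_r(f, weights) for f, outcome in games if outcome != "W"]
    if not won or not rest:
        return 0.0
    return sum(won) / len(won) - sum(rest) / len(rest)


def search_weights(games, n_zones, levels=(0.0, 0.5, 1.0)):
    """Exhaustively try every weight vector on a coarse grid."""
    candidates = itertools.product(levels, repeat=n_zones)
    best = max(candidates, key=lambda w: score(w, games))
    return best, score(best, games)


# Made-up sample: (zone fractions r1..r3, outcome from B's point of view).
games = [([0.6, 0.3, 0.1], "W"), ([0.5, 0.3, 0.2], "W"),
         ([0.2, 0.3, 0.5], "L"), ([0.3, 0.3, 0.4], "D")]
best_weights, best_score = search_weights(games, n_zones=3)
```

The same loop could also vary the zone boundaries, which is where the
search becomes much harder.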

Regards to all,

Lotfi

----------------------------------------------------------
Professor in the Graduate School and Director,
Berkeley Initiative in Soft Computing (BISC)
CS Division, Department of EECS
University of California
Berkeley, CA 94720-1776
Tel/office: (510) 642-4959 Fax/office: (510) 642-1712
Tel/home: (510) 526-2569 Fax/home: (510) 526-2433
----------------------------------------------------------

Michael Berthold: berthold@cs.berkeley.edu


This archive was generated by hypermail 2b25 : Tue Jul 18 2000 - 20:10:51 MET DST