Fuzzy c-means initialisation -- summary of replies [long].

Jonathan G Campbell (jg.campbell@ulst.ac.uk)
Wed, 29 Apr 1998 09:22:51 +0200 (MET DST)

On 8 April, I wrote:

" I'm implementing a fuzzy c-means clustering routine -- following p.360
of G.J. Klir and B. Yuan, 1995, Fuzzy Sets and Fuzzy Logic, Prentice
Hall.

Is there any agreed `best' method of initialisation of the memberships?
-- presumably best in the sense of faster convergence; I assume that
local optima are not a problem.

At present, I intend initialising as per my k-means clustering routine,
i.e. randomly allocating class labels; for the fuzzy version, I'll simply
initialise memberships to {0.0, 1.0}."

Many thanks for all the replies. Here is a summary of those that came by
e-mail; my news server was acting very strangely over the Easter period,
so I may have missed those that were posted.

--------- Larry Hall hall@csee.usf.edu -------------------------------

You can take the range of each feature, partition it into c intervals
(one per cluster center), and take the same point within each of the c
intervals (e.g. the midpoint) as the centroid value of a cluster for
that feature.
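
A minimal standalone sketch of this idea (mine, not from Larry's message;
it assumes the data sit in a double[n][d] array and takes the midpoint of
each interval):

static double[][] rangePartitionCentres(double[][] x, int c)
{
    int n = x.length, nd = x[0].length;
    double[][] centres = new double[c][nd];
    for (int d = 0; d < nd; d++) {
        // range of feature d
        double min = x[0][d], max = x[0][d];
        for (int i = 1; i < n; i++) {
            if (x[i][d] < min) min = x[i][d];
            if (x[i][d] > max) max = x[i][d];
        }
        // c equal intervals; the midpoint of each becomes a centre value
        double width = (max - min) / c;
        for (int ic = 0; ic < c; ic++)
            centres[ic][d] = min + (ic + 0.5) * width;
    }
    return centres;
}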

-------- Robert E. Dalton red@swl.msd.ray.com ------------------------

Some time ago, I think 1993-94, Ron Yager published
a note in Systems, Man and Cybernetics on something called
the "Mountain Method", which could be used to initialize
memberships in the c-means algorithm. Although Yager
suggested his method might speed convergence of c-means,
he didn't prove it or furnish experimental data supporting
his claim.

In my experience, using a crisp initialization that satisfies
the constraints works as well as any other technique. I'd
be VERY interested if you found something better.

--------- Amparo Vila <vila@decsai.ugr.es> ---------------------

Try the methods included in :
M. Delgado, A.F. Gomez Skarmeta and M.A. Vila,
On the Use of Hierarchical Clustering in Fuzzy Modelling,
International Journal of Approximate Reasoning, Vol. 16 (3-4), pp. 377-391
(1997).

------ Oscar Duarte <visit2@decsai.ugr.es> --------------------

It is better to use a random initialization strategy, constrained so that
each datum's memberships sum to 1, i.e. something like this:
///////////////////
/* Fragment: assumes U[NUMEROGRUPOS][NUMERODATOS] holds the memberships,
   NUMERODATOS is the number of data, NUMEROGRUPOS the number of clusters,
   and that i, j, suma and temp are declared; randomize() seeds rand()
   (srand(time(NULL)) in standard C). */
randomize();
for (i = 0; i < NUMERODATOS; i++) {
    suma = 0;
    for (j = 0; j < NUMEROGRUPOS - 1; j++) {
        temp = (float)rand() / (float)RAND_MAX;   /* uniform in [0,1] */
        U[j][i] = temp * (1.0 - suma);            /* a share of what is left */
        suma = suma + U[j][i];
    }
    U[NUMEROGRUPOS - 1][i] = 1.0 - suma;          /* last class takes the remainder */
}
///////////////////////

You can find simple C source code (one-dimensional) at

http://ohm.ingsala.unal.edu.co/ogduarte (select "software" and then
"c-means")

------- Kuhu Pal <res9517@isical.ac.in> --------------

I am also working on fuzzy clustering. I am initializing the
membership values randomly; I do not know of any better method for
initializing them.
-----------------------------------------------------------------------

Maybe the resulting function (in Java, for the DataLab-J environment) is
of some interest; see below. Regarding initialisation, the init argument
selects the initialisation method: 1 for random crisp classes, 2 for
Duarte's random fuzzy (normalised) classes, 3 for Hall's feature-space
partitioning; 0 starts with whatever is currently in the memberships -- I
use this in connection with a routine that generates fuzzy classes from
crisp classes, e.g. as produced by (crisp) k-means clustering. Of course,
any errors are mine.

I have tested using the 100 data from the first two Anderson/Fisher iris
classes. Unfortunately, except for initialisation from a k-means result, I
cannot report any substantial differences in speed of convergence, but
then these data are probably a poor test. If anyone can suggest a better
test, or has data for which they would like comparisons, please email
me.
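
For what it's worth, a hypothetical call for such a test -- assuming the
100 iris data have already been loaded into a Dld called iris -- would be:

    fcmeancu(iris, 2, 1);   // two classes, random crisp initialisation (init = 1)
    fcmeancu(iris, 2, 2);   // same, but Duarte's random fuzzy initialisation (init = 2)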

A quick note regarding the DataLab-J(ava) environment. `Dld' is the
basic data object; a Dld contains a `primary data' block which, in
general, is a multi-band/variate/dimension `image'. A Dld also contains
three additional but optional data blocks, each of the same form as
`primary data': (1) crisp label data; (2) fuzzy label data -- one
dimension per class; (3) ancillary data -- e.g. continuous dependent
data, where we have a model y = f(x), with x the independent variable
(primary data), y the dependent variable, and f(.) some function,
presumably to be estimated.
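
Roughly (and this is only an illustration of the layout just described,
not the actual DataLab-J class), a Dld can be pictured as:

// Illustrative sketch only: the field names and types are my guesses,
// not the real DataLab-J implementation.
class DldSketch {
    float[][][] primary;    // [dimension][row][column] -- the `primary data' image
    int[][]     crisp;      // [row][column]            -- crisp label data (optional)
    float[][][] fuzzy;      // [class][row][column]     -- fuzzy label data (optional)
    float[][][] ancillary;  // [dimension][row][column] -- ancillary data, e.g. y (optional)
}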

You will note that everything is an `image', i.e. all data are accessed
by the index pair r(ow), c(olumn). If the data are one-dimensional (e.g. a
digital signal), then there is simply one row. Likewise, for set data
(with no sense of neighbourhood/index), we simply store them as one row --
and make sure that no signal/image processing involving neighbourhoods is
used.

For more, access the paper: Jonathan Campbell and Fionn Murtagh, Signal
and Image Processing in Java at:
http://www.infm.ulst.ac.uk/research/preprints.html

That paper is a little out of date -- the fuzzy classes having been
added only recently -- but the basic design/concepts haven't changed.

Best regards and many thanks.

Jon Campbell

P.S. If others posted suggestions that weren't emailed directly to me,
could you possibly email them -- as I said, my news server suffered a
crisis last week. j.c.

// These methods belong to a DataLab-J class; they assume
// import java.io.PrintStream; and import java.util.Random;
static final double BIG = 1.0e30;
static final double mpow = 1.25;                  // fuzziness exponent m
static final double epsd = 1.0e-30;               // `zero' distance threshold
static final double recmpwm = 1.0/(mpow-1.0);     // 1/(m-1)
static final int loopmax = 100;

public static void fcmeancu(Dld source, int ncl, int init)
{
    PrintStream out = System.out;
    int nd = source.ndimsd();                     // number of primary-data dimensions
    int nr = source.nrows();
    int nc = source.ncols();
    float ns = (float)nr*nc;
    int ic, d, r, c;
    double memb[] = new double[ncl], membold[] = null, sum, min, step;
    double m[][] = new double[ncl][nd];           // cluster centres
    long rseed = 11293;
    Random rand = new Random(rseed);

    if(source.ndimsf() != ncl) source.addfla(ncl);   // ensure ncl fuzzy-label dimensions

    // initialise
    if(init == 1){ /* random crisp classes, as in the k-means-clustering initialisation */
        for(r = 0; r < nr; r++){
            for(c = 0; c < nc; c++){
                for(ic = 0; ic < ncl; ic++) source.putf(0.0f, ic, r, c);
                ic = Math.abs(rand.nextInt())%ncl;
                source.putf(1.0f, ic, r, c);      // ic is the dimension for the class chosen
            }
        }
        m = source.fMeanDouble(mpow);             // compute fuzzy class means/cluster centres
    }
    else if(init == 2){ /* random fuzzy memberships, normalised to sum to 1 --
                           after Oscar Duarte, http://ohm.ingsala.unal.edu.co/ogduarte */
        for(r = 0; r < nr; r++){
            for(c = 0; c < nc; c++){
                sum = 0.0;
                for(ic = 0; ic < ncl; ic++){
                    memb[ic] = rand.nextDouble();
                    sum += memb[ic];
                }
                for(ic = 0; ic < ncl; ic++){
                    source.putf((float)(memb[ic]/sum), ic, r, c);
                }
            }
        }
        m = source.fMeanDouble(mpow);
    }
    else if(init == 3){ /* cluster centres initialised by independent feature range
                           partitioning -- Larry Hall */
        for(d = 0; d < nd; d++){
            min = (double)source.mind(d);
            // ncl centres spaced evenly within the range of feature d
            step = ((double)source.maxd(d) - min)/(double)(ncl + 1);
            for(ic = 0; ic < ncl; ic++) m[ic][d] = min + step*(double)(ic + 1);
        }
    }
    else{ /* assume classes already initialised, e.g. by copying the results of
             (crisp) k-means clustering */
        m = source.fMeanDouble(mpow);
    }

    boolean done = false;
    double eps = 0.01;     // convergence threshold on membership change

    int loop = 0;
    double x[] = null;
    while(!done){
        double p = -1.0e30;                         // largest membership change this pass
        loop++;
        out.println(" loop " + loop);
        for(r = 0; r < nr; r++){
            for(c = 0; c < nc; c++){
                x = source.getpvecdDouble(r, c);        // (primary) data vector
                membold = source.getpvecfDouble(r, c);  // current membership vector
                fnmcl(memb, m, ncl, x);                 // new memberships for this datum
                for(ic = 0; ic < ncl; ic++){
                    source.putf((float)memb[ic], ic, r, c);
                    p = Math.max(p, Math.abs(memb[ic] - membold[ic]));
                }
                membold = memb;
            }
        }
        m = source.fMeanDouble(mpow);               // recompute cluster centres
        out.print(" p " + p);
        done = (p < eps)||(loop > loopmax);
    }
}

private static void fnmcl(double memb[], double[][] m,
                          int ncl, double[] x)
{
    // Standard fuzzy c-means membership update for one datum x:
    //   memb[ic] = 1 / sum_j (d2(x,m[ic])/d2(x,m[j]))^(1/(mpow-1))
    // where d2 is the squared Euclidean distance and recmpwm = 1/(mpow-1).
    // A datum (numerically) coincident with a centre gets full membership
    // in that class.
    double s[] = new double[ncl], mtemp, dist;
    int ic;

    for(ic = 0; ic < ncl; ic++) s[ic] = Matvec.distsq(x, m[ic]);   // squared distances to centres

    for(ic = 0; ic < ncl; ic++){
        dist = s[ic];
        if(dist < epsd){
            memb[ic] = 1.0;
            for(int j = 0; j < ncl; j++) if(j != ic) memb[j] = 0.0;
        }
        else{
            mtemp = 0.0;
            for(int j = 0; j < ncl; j++) mtemp += Math.pow(dist/s[j], recmpwm);
            memb[ic] = 1.0/mtemp;
        }
    }
}

-- 
Jonathan G Campbell Univ. Ulster Magee College Derry BT48 7JL N. Ireland 
+44 1504 375367 JG.Campbell@ulst.ac.uk  http://www.infm.ulst.ac.uk/~jgc/