Soft vector quantization and the EM algorithm

E. Alpaydın - Neural Networks, 1998 - Elsevier
The relation between hard c-means (HCM), fuzzy c-means (FCM), fuzzy learning vector quantization (FLVQ), the soft competition scheme (SCS) of Yair et al. (1992), and probabilistic Gaussian mixtures (GM) has been pointed out recently by Bezdek and Pal (1995). We extend this relation to their training, showing that the learning rules these models use to estimate the cluster centers can be seen as approximations to the expectation–maximization (EM) method applied to Gaussian mixtures. HCM and unsupervised LVQ use 1-of-c competition. In FCM and FLVQ, membership is the −2/(m−1)th power of the distance. In SCS and GM, a Gaussian function is used. When the Gaussian membership function is used, the weighted within-groups sum of squared errors that serves as the fuzzy objective function corresponds to the maximum-likelihood estimate in Gaussian mixtures with equal priors and covariances. The fuzzy c-means alternating optimization procedure (FCM-AO) proposed to optimize the former is then equivalent to batch EM, and SCS's update rule is a variant of the online version of EM. The advantages of the probabilistic framework are: (i) we no longer have spurious spread parameters that need fine-tuning, such as m in fuzzy vector quantization or β in SCS; instead we have a variance term that has a sound interpretation and can be estimated from the sample; (ii) EM guarantees that the likelihood does not decrease, so it converges to the nearest local optimum; (iii) EM also allows us to estimate the underlying distance norm and the cluster priors, which we could not do with the other approaches. We compare Gaussian mixtures trained with EM against LVQ (HCM), SCS, and FLVQ on the IRIS dataset and find the mixture more accurate because it can take the covariance information into account. We finally note that vector quantization is generally an intermediate step before a final output for which supervision may be possible. Thus, instead of an uncoupled approach, where an unsupervised method first finds the cluster parameters and the mapping based on the memberships is then trained with supervision, we advocate a coupled approach where the cluster parameters and the mapping are trained together under supervision. The uncoupled approach ignores the error at the outputs, which may not be ideal.
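For concreteness, the three membership rules compared above can be written side by side. The notation here is ours (x_i a sample, m_j a cluster center, d_ij = ||x_i − m_j||, u_ij the membership of x_i in cluster j), a sketch rather than the paper's exact formulation:

\[
u_{ij}^{\mathrm{HCM}} =
\begin{cases}
1 & \text{if } j = \arg\min_{k} d_{ik},\\
0 & \text{otherwise,}
\end{cases}
\qquad
u_{ij}^{\mathrm{FCM}} = \frac{d_{ij}^{-2/(m-1)}}{\sum_{k=1}^{c} d_{ik}^{-2/(m-1)}},
\qquad
u_{ij}^{\mathrm{GM}} = \frac{\exp\left(-d_{ij}^{2}/2\sigma^{2}\right)}{\sum_{k=1}^{c} \exp\left(-d_{ik}^{2}/2\sigma^{2}\right)}.
\]

Each rule feeds a membership-weighted mean as the center update (FCM weights by u_ij^m rather than u_ij), which is what makes the correspondence to the EM updates for a Gaussian mixture possible.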
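The batch-EM special case the abstract identifies with FCM-AO, a Gaussian mixture with equal priors and a shared isotropic variance, can be sketched in a few lines of Python. This is a minimal illustration under those assumptions, not the paper's code; all identifiers (em_soft_vq, sigma2, u) are ours:

import numpy as np

def em_soft_vq(X, c, n_iter=50, seed=0):
    """Batch EM for a Gaussian mixture with equal priors (1/c) and a
    shared isotropic variance, so the E-step posterior plays the role
    of the soft membership u_ij in the text."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=c, replace=False)]  # init centers from data
    sigma2 = X.var()                                 # shared variance estimate

    for _ in range(n_iter):
        # E-step: u_ij = p(cluster j | x_i) under equal priors,
        # i.e. the normalized Gaussian membership
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
        logits = -sq_dist / (2.0 * sigma2)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        u = np.exp(logits)
        u /= u.sum(axis=1, keepdims=True)

        # M-step: centers are membership-weighted means; the variance is
        # re-estimated from the sample rather than hand-tuned, unlike
        # m in FCM/FLVQ or beta in SCS
        means = (u.T @ X) / u.sum(axis=0)[:, None]
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
        sigma2 = (u * sq_dist).sum() / (n * d)

    return means, sigma2

Hardening the memberships to 1-of-c turns the same loop into HCM, while updating the centers one sample at a time rather than in batch gives the online variant of EM that the abstract relates to SCS's update rule.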