Some Software Packages
(from Shi Zhong's research)
All programs listed below are distributed under the
GNU GPL
license and are therefore free for you to use or distribute. Please be aware
that these programs come with absolutely no warranty whatsoever, and in no
way will I be held responsible for loss of property due to the use of
these software packages. Use them at your own risk.
Model-based Text Clustering
(in Matlab)
- Brief description
- The package contains model-based hard k-means and soft EM clustering
algorithms for text applications. Probabilistic models implemented
include multivariate Bernoulli, multinomial, and von Mises-Fisher
models. Deterministic annealing versions of all the above algorithms are
also included.
- Source code
- Text data in Matlab format
- [docdata.zip] For a description of this
data, see the first reference below.
- References
- Shi Zhong and Joydeep Ghosh, "Generative model-based clustering of
documents: a comparative study," Knowledge and Information Systems
(KAIS), vol. 8, pp. 374-384, 2005.
- Shi Zhong and Joydeep Ghosh, "A unified framework for model-based
clustering," Journal of Machine Learning Research (JMLR),
vol. 4, pp. 1001-1037, November 2003.
- Shi Zhong and Joydeep Ghosh, "Scalable, balanced model-based
clustering," in SIAM Int. Conf. Data Mining (SDM 2003), pp.
71-82, San Francisco, CA, May 2003.
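To illustrate the difference between the hard (k-means style) and soft (EM style) assignments mentioned in the package description, here is a minimal sketch in Python. It is not taken from the Matlab package. It assumes a simplified von Mises-Fisher style model with equal cluster priors and one fixed, shared concentration parameter `kappa`, and all function names are hypothetical. Under these assumptions `kappa` acts like an inverse temperature: small values give nearly uniform soft assignments, and large values approach hard assignments.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def soft_assign(doc, means, kappa):
    """Soft (EM-style) assignment: weights proportional to exp(kappa * cosine)."""
    scores = [kappa * dot(doc, m) for m in means]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def hard_assign(doc, means):
    """Hard (k-means-style) assignment: index of the most similar mean."""
    sims = [dot(doc, m) for m in means]
    return sims.index(max(sims))

def update_means(docs, posteriors, k):
    """Posterior-weighted sum of documents per cluster, rescaled to unit length."""
    dim = len(docs[0])
    means = []
    for j in range(k):
        total = [0.0] * dim
        for doc, post in zip(docs, posteriors):
            for i in range(dim):
                total[i] += post[j] * doc[i]
        means.append(normalize(total))
    return means

def cluster(docs, init_means, kappa=10.0, iters=20):
    """Run soft updates with a fixed kappa, then report hard labels."""
    docs = [normalize(d) for d in docs]
    means = [normalize(m) for m in init_means]
    for _ in range(iters):
        posteriors = [soft_assign(d, means, kappa) for d in docs]
        means = update_means(docs, posteriors, len(means))
    return means, [hard_assign(d, means) for d in docs]

# Toy term-count vectors: the first two documents share a dominant first term,
# the last two share a dominant third term.
docs = [[5, 1, 0], [4, 0, 1], [0, 1, 6], [1, 0, 5]]
means, labels = cluster(docs, init_means=[[1, 0, 0], [0, 0, 1]])
print(labels)
```

A deterministic annealing variant would, under the same assumptions, start with a small `kappa` and increase it gradually across iterations instead of keeping it fixed.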
Online Spherical K-Means Clustering
(OSKM, in C & Matlab)
Coupled Hidden Markov Models (CHMM,
in Matlab)