As data collection sources and channels continuously evolve, mining and correlating information from multiple information sources has become a crucial step in data mining and knowledge discovery. On one hand, comparing patterns from different databases and understanding their relationships can be extremely beneficial for applications such as Bioinformatics, Sensor Networking, and Business Intelligence. In particular, important information such as pattern trends and evolving rules buried in each individual database is very hard to discover by examining a single dataset alone, whereas comparatively mining multiple databases enables users to discover interesting patterns across a set of data collections that could not be found otherwise. On the other hand, many data mining and data analysis tasks, such as classification, regression, and clustering, can significantly improve their performance if information from different sources is properly leveraged and if the mining process has the power to survey all the data sources involved.
Unleashing the full power of multiple information sources is, however, a very challenging problem, considering that the schemas used to represent each data collection might be different (data heterogeneity), data distributions and patterns underlying different data sources may undergo continuous changes (concept evolution), and mining tasks for each data source might also be different (mining diversity). Even though existing research has demonstrated several approaches to utilizing multiple information sources, these methods are still rather ad hoc and inadequately address some of the fundamental research issues in this field: (1) Harnessing Complex Data Relationships: Multiple information sources represent a collection of highly correlated data; issues such as data integration, model integration, and model transfer across different domains play fundamental roles in supporting KDD from multiple information sources; (2) Integrative and Cooperative Mining: For heterogeneous information sources with diverse mining tasks, the mining process should be able to unify all data to generate enhanced global models, as well as help individual data collections cooperatively achieve their respective mining goals; and (3) Differentiation and Correlation: Differentiating and coordinating the differences between data sources at the knowledge level is a crucial step for users to gain a high-level understanding of their data.
The aim of this workshop is to bring together data mining experts to revisit the problem of pattern discovery from multiple information sources, and identify and synthesize current needs for such purposes. Representative questions to be addressed include but are not limited to:
We solicit two types of papers: regular papers and short papers. Short papers are limited to 4 pages and regular papers to about 8 pages, inclusive of all references and figures; however, papers of up to 12 pages will also be reviewed and included in the proceedings.
All papers should be submitted in ACM proceedings format (two columns, 9pt font, approx. 1in margins). Please follow the ACM Proceedings guidelines when preparing your paper, which can be found at: http://www.acm.org/sigs/pubs/proceed/template.html.
We strongly encourage authors to prepare their manuscripts in PDF (preferred) or PostScript format. Please ensure that any special fonts used are included in the submitted documents.
The workshop proceedings will be published by the ACM Digital Library and distributed during the workshop.
Extended versions of selected workshop papers will be published in an edited book (Springer, pending approval).
To submit a paper, please use the EasyChair system at http://www.easychair.org/conferences/?conf=mmis08
If you have not used EasyChair before, please register there first.
If you experience any difficulties, please contact the workshop co-chairs. Upon receipt of each submission, the workshop co-chairs will organize the peer-review process immediately.