Advances in Knowledge Discovery and Data Mining
Available within 2-3 days
Description: The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) has been held every year since 1997. This year, the eighth in the series (PAKDD 2004) was held at Carlton Crest Hotel, Sydney, Australia, 26-28 May 2004. PAKDD is a leading international conference in the area of data mining. It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition and automatic scientific discovery, data visualization, causal induction, and knowledge-based systems. The selection process this year was extremely competitive. We received 238 research papers from 23 countries, which is the highest in the history of PAKDD, and reflects the recognition of and interest in this conference. Each submitted research paper was reviewed by three members of the program committee. Following this independent review, there were discussions among the reviewers, and when necessary, additional reviews from other experts were requested. A total of 50 papers were selected as full papers (21%), and another 31 were selected as short papers (13%), yielding a combined acceptance rate of approximately 34%. The conference accommodated both research papers presenting original investigation results and industrial papers reporting real data mining applications and system development experience. The conference also included three tutorials on key technologies of knowledge discovery and data mining, and one workshop focusing on specific new challenges and emerging issues of knowledge discovery and data mining. The PAKDD 2004 program was further enhanced with keynote speeches by two outstanding researchers in the area of knowledge discovery and data mining: Philip Yu, Manager of Software Tools and Techniques, IBM T.J.
Table of Contents

Invited Speeches
  Mining of Evolving Data Streams with Privacy Preservation
  Data Mining Grand Challenges

Session 1A: Classification (I)
  Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms
  Spectral Energy Minimization for Semi-supervised Learning
  Discriminative Methods for Multi-labeled Classification

Session 1B: Clustering (I)
  Subspace Clustering of High Dimensional Spatial Data with Noises
  Constraint-Based Graph Clustering through Node Sequencing and Partitioning
  Mining Expressive Process Models by Clustering Workflow Traces

Session 1C: Association Rules (I)
  CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees
  Secure Association Rule Sharing
  Self-Similar Mining of Time Association Rules

Session 2A: Novel Algorithms (I)
  ParaDualMiner: An Efficient Parallel Implementation of the DualMiner Algorithm
  A Novel Distributed Collaborative Filtering Algorithm and Its Implementation on P2P Overlay Network
  An Efficient Algorithm for Dense Regions Discovery from Large-Scale Data Streams
  Blind Data Linkage Using n-gram Similarity Comparisons
  Condensed Representation of Emerging Patterns

Session 2B: Association (II)
  Discovery of Maximally Frequent Tag Tree Patterns with Contractible Variables from Semistructured Documents
  Mining Term Association Rules for Heuristic Query Construction
  FP-Bonsai: The Art of Growing and Pruning Small FP-Trees
  Mining Negative Rules Using GRD
  Applying Association Rules for Interesting Recommendations Using Rule Templates

Session 2C: Classification (II)
  Feature Extraction and Classification System for Nonlinear and Online Data
  A Metric Approach to Building Decision Trees Based on Goodman-Kruskal Association Index
  DRC-BK: Mining Classification Rules with Help of SVM
  A New Data Mining Method Using Organizational Coevolutionary Mechanism
  Noise Tolerant Classification by Chi Emerging Patterns
  The Application of Emerging Patterns for Improving the Quality of Rare-Class Classification

Session 3A: Event Mining, Anomaly Detection, and Intrusion Detection
  Finding Negative Event-Oriented Patterns in Long Temporal Sequences
  OBE: Outlier by Example
  Temporal Sequence Associations for Rare Events
  Summarization of Spacecraft Telemetry Data by Extracting Significant Temporal Patterns
  An Extended Negative Selection Algorithm for Anomaly Detection
  Adaptive Clustering for Network Intrusion Detection

Session 3B: Ensemble Learning
  Ensembling MML Causal Discovery
  Logistic Regression and Boosting for Labeled Bags of Instances
  Fast and Light Boosting for Adaptive Mining of Data Streams
  Compact Dual Ensembles for Active Learning
  On the Size of Training Set and the Benefit from Ensemble

Session 3C: Bayesian Network and Graph Mining
  Identifying Markov Blankets Using Lasso Estimation
  Selective Augmented Bayesian Network Classifiers Based on Rough Set Theory
  Using Self-Consistent Naive-Bayes to Detect Masquerades
  DB-Subdue: Database Approach to Graph Mining

Session 3D: Text Mining (I)
  Finding Frequent Structural Features among Words in Tree-Structured Documents
  Exploring Potential of Leave-One-Out Estimator for Calibration of SVM in Text Mining
  Classifying Text Streams in the Presence of Concept Drifts
  Using Cluster-Based Sampling to Select Initial Training Set for Active Learning in Text Classification
  Spectral Analysis of Text Collection for Similarity-Based Clustering

Session 4A: Clustering (II)
  Clustering Multi-represented Objects with Noise
  Providing Diversity in K-Nearest Neighbor Query Results
  Cluster Structure of K-means Clustering via Principal Component Analysis
  Combining Clustering with Moving Sequential Pattern Mining: A Novel and Efficient Technique
  An Alternative Methodology for Mining Seasonal Pattern Using Self-Organizing Map

Session 4B: Association (III)
  ISM: Item Selection for Marketing with Cross-Selling Considerations
  Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining
  Mining Association Rules from Structural Deltas of Historical XML Documents
  Data Mining Proxy: Serving Large Number of Users for Efficient Frequent Itemset Mining

Session 4C: Novel Algorithms (II)
  Formal Approach and Automated Tool for Translating ER Schemata into OWL Ontologies
  Separating Structure from Interestingness
  Exploiting Recurring Usage Patterns to Enhance Filesystem and Memory Subsystem Performance

Session 4D: Multimedia Mining
  Automatic Text Extraction for Content-Based Image Indexing
  Peculiarity Oriented Analysis in Multi-people Tracking Images
  AutoSplit: Fast and Scalable Discovery of Hidden Variables in Stream and Multimedia Databases

Session 5A: Text Mining and Web Mining (II)
  Semantic Sequence Kin: A Method of Document Copy Detection
  Extracting Citation Metadata from Online Publication Lists Using BLAST
  Mining of Web-Page Visiting Patterns with Continuous-Time Markov Models
  Discovering Ordered Tree Patterns from XML Queries
  Predicting Web Requests Efficiently Using a Probability Model

Session 5B: Statistical Methods, Sequential Data Mining, and Time Series Mining
  CCMine: Efficient Mining of Confidence-Closed Correlated Patterns
  A Conditional Probability Distribution-Based Dissimilarity Measure for Categorial Data
  Learning Hidden Markov Model Topology Based on KL Divergence for Information Extraction
  A Non-parametric Wavelet Feature Extractor for Time Series Classification
  Rules Discovery from Cross-Sectional Short-Length Time Series

Session 5C: Novel Algorithms (III)
  Constraint-Based Mining of Formal Concepts in Transactional Data
  Towards Optimizing Conjunctive Inductive Queries
  Febrl - A Parallel Open Source Data Linkage System
  A General Coding Method for Error-Correcting Output Codes
  Discovering Partial Periodic Patterns in Discrete Data Sequences

Session 5D: Biomedical Mining
  Conceptual Mining of Large Administrative Health Data
  A Semi-automatic System for Tagging Specialized Corpora
  A Tree-Based Approach to the Discovery of Diagnostic Biomarkers for Ovarian Cancer
  A Novel Parameter-Less Clustering Method for Mining Gene Expression Data
  Extracting and Explaining Biological Knowledge in Microarray Data
  Further Applications of a Particle Visualization Framework
Subtitle: 8th Pacific-Asia Conference, PAKDD 2004, Sydney, Australia, May 26-28, 2004, Proceedings. 2004 edition. Book. Language: English.
Publication date: May 2004
Page count: 740 pages