


14th KDD 2008: Las Vegas, Nevada, USA
- Ying Li, Bing Liu, Sunita Sarawagi:
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008. ACM 2008, ISBN 978-1-60558-193-4
- Benjamin Edelman, Michael Schwarz:
Internet advertising and optimal auction design. 1
- Thore Graepel, Ralf Herbrich:
Large scale data analysis and modelling in online services and advertising. 2
- Trevor Hastie, Jerome H. Friedman, Robert Tibshirani:
Regularization paths and coordinate descent. 3
- Jitendra Malik:
The future of image search. 4
- Udo Miletzki:
Genesis of postal address reading, current state and future prospects: thirty years of pattern recognition on duty of postal services. 5-6
Research papers
- Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian:
Influence and correlation in social networks. 7-15
- Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis:
Efficient semi-streaming algorithms for local triangle counting in massive graphs. 16-24
- Indrajit Bhattacharya, Shantanu Godbole, Sachindra Joshi:
Structured entity identification and document categorization: two tasks with one joint model. 25-33
- Albert Bifet, Ricard Gavaldà:
Mining adaptively frequent closed unlabeled rooted trees in data streams. 34-42
- Mustafa Bilgic, Lise Getoor:
Effective label acquisition for collective classification. 43-51
- Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis:
Topical query decomposition. 52-60
- Christos Boutsidis, Michael W. Mahoney, Petros Drineas:
Unsupervised feature selection for principal components analysis. 61-69
- Justin Brickell, Vitaly Shmatikov:
The cost of privacy: destruction of data-mining utility in anonymized data publishing. 70-78
- Deepayan Chakrabarti, Ravi Kumar, Kunal Punera:
Generating succinct titles for web URLs. 79-87
- Soumen Chakrabarti, Rajiv Khanna, Uma Sawant, Chiru Bhattacharyya:
Structured learning for non-smooth ranking losses. 88-96
- Ming-Wei Chang, Wen-tau Yih, Christopher Meek:
Partitioned logistic regression for spam filtering. 97-105
- Jianhui Chen, Shuiwang Ji, Betul Ceran, Qi Li, Mingrui Wu, Jieping Ye:
Learning subspace kernels for classification. 106-114
- WenYen Chen, Dong Zhang, Edward Y. Chang:
Combinational collaborative filtering for personalized community recommendation. 115-123
- Xue-wen Chen, Michael Wasikowski:
FAST: a roc-based feature selection metric for small samples and imbalanced data classification problems. 124-132
- Haibin Cheng, Pang-Ning Tan:
Semi-supervised learning with data calibration for long-term time series forecasting. 133-141
- Yong Ju Cho, Naren Ramakrishnan, Yang Cao:
Reconstructing chemical reaction networks: data mining meets system identification. 142-150
- Peter Christen:
Automatic record linkage using seeded nearest neighbour and support vector machine classification. 151-159
- David J. Crandall, Dan Cosley, Daniel P. Huttenlocher, Jon M. Kleinberg, Siddharth Suri:
Feedback effects between similarity and social influence in online communities. 160-168
- Kaustav Das, Jeff G. Schneider, Daniel B. Neill:
Anomaly pattern detection in categorical datasets. 169-176
- Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong:
Bypass rates: reducing query abandonment using negative inferences. 177-185
- Anirban Dasgupta, Ravi Kumar, Amit Sasturkar:
De-duping URLs via rewrite rules. 186-194
- Jason V. Davis, Inderjit S. Dhillon:
Structured metric learning for high dimensional problems. 195-203
- Luc De Raedt, Tias Guns, Siegfried Nijssen:
Constraint programming for itemset mining. 204-212
- Charles Elkan, Keith Noto:
Learning classifiers from only positive and unlabeled data. 213-220
- Kave Eshghi, Shyamsundar Rajaram:
Locality sensitive hash functions based on concomitant rank order statistics. 221-229
- Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei Han, Philip S. Yu, Olivier Verscheure:
Direct mining of discriminative and essential frequent patterns via model-based search tree. 230-238
- George Forman, Shyamsundar Rajaram:
Scaling up text classification for large file systems. 239-246
- Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamuro:
SPIRAL: efficient and exact model identification for hidden Markov models. 247-255
- Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, Christos Faloutsos:
Using ghost edges for classification in sparsely labeled networks. 256-264
- Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, Adam D. Smith:
Composition attacks and auxiliary information in data privacy. 265-273
- Venkatesh Ganti, Arnd Christian König, Rares Vernica:
Entity categorization over large document collections. 274-282
- Jing Gao, Wei Fan, Jing Jiang, Jiawei Han:
Knowledge transfer via multiple model local structure mapping. 283-291
- Gemma C. Garriga, Esa Junttila, Heikki Mannila:
Banded structure in binary matrices. 292-300
- Rohit Gupta, Gang Fang, Blayne Field, Michael S. Steinbach, Vipin Kumar:
Quantitative evaluation of approximate frequent pattern mining algorithms. 301-309
- Robert J. Hall, Charles Sutton, Andrew McCallum:
Unsupervised deduplication using cross-field dependencies. 310-317
- Meng Hu, Jiong Yang, Wei Su:
Permu-pattern: discovery of mutable permutation patterns with proximity constraint. 318-326
- Heng Huang, Chris H. Q. Ding, Dijun Luo, Tao Li:
Simultaneous tensor subspace selection and clustering: the equivalence of high order svd and k-means clustering. 327-335
- Woochang Hwang, Taehyong Kim, Murali Ramanathan, Aidong Zhang:
Bridging centrality: graph mining from element level to group level. 336-344
- Saara Hyvönen, Pauli Miettinen, Evimaria Terzi:
Interpretable nonnegative matrix decompositions. 345-353
- Georgiana Ifrim, Gökhan H. Bakir, Gerhard Weikum:
Fast logistic regression for text categorization with variable-length n-grams. 354-362
- Tomoharu Iwata, Takeshi Yamada, Naonori Ueda:
Probabilistic latent semantic visualization: topic model for visualizing documents. 363-371
- David D. Jensen, Andrew S. Fast, Brian J. Taylor, Marc E. Maier:
Automatic identification of quasi-experimental designs for discovering causal knowledge. 372-380
- Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye:
Extracting shared subspace for multi-label classification. 381-389
- Bin Jiang, Jian Pei, Xuemin Lin, David W. Cheung, Jiawei Han:
Mining preferences from superior and inferior examples. 390-398
- Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan:
Effective and efficient itemset pattern summarization: regression-based approaches. 399-407
- S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, Chih-Jen Lin:
A sequential dual method for large scale multi-class linear svms. 408-416
- Jerry Kiernan, Evimaria Terzi:
Constructing comprehensive summaries of large event sequences. 417-425
- Yehuda Koren:
Factorization meets the neighborhood: a multifaceted collaborative filtering model. 426-434
- Gueorgi Kossinets, Jon M. Kleinberg, Duncan J. Watts:
The structure of information pathways in a social communication network. 435-443
- Hans-Peter Kriegel, Matthias Schubert, Arthur Zimek:
Angle-based outlier detection in high-dimensional data. 444-452
- Srivatsan Laxman, Vikram Tankasali, Ryen W. White:
Stream prediction using a generative model based on frequent episodes in event sequences. 453-461
- Jure Leskovec, Lars Backstrom, Ravi Kumar, Andrew Tomkins:
Microscopic evolution of social networks. 462-470
- Lei Li, Wenjie Fu, Fan Guo, Todd C. Mowry, Christos Faloutsos:
Cut-and-stitch: efficient parallel learning of linear dynamical systems on smps. 471-479
- Charles X. Ling, Jun Du:
Active learning with direct query construction. 480-487
- Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu:
Spectral domain-transfer learning. 488-496
- Xu Ling, Qiaozhu Mei, ChengXiang Zhai, Bruce R. Schatz:
Mining multi-faceted overviews of arbitrary topics in a text collection. 497-505
- Aurélie C. Lozano, Naoki Abe:
Multi-class cost-sensitive boosting with p-norm loss functions. 506-514
- Omid Madani, Jian Huang:
On updates that constrain the features' connections during learning. 515-523
- Mary McGlohon, Leman Akoglu, Christos Faloutsos:
Weighted graphs and disconnected components: patterns and a generator. 524-532
- Gabriela Moise, Jörg Sander:
Finding non-redundant, statistically significant regions in high dimensional data: a novel approach to projected and subspace clustering. 533-541
- Ramesh Nallapati, Amr Ahmed, Eric P. Xing, William W. Cohen:
Joint latent topic models for text and citations. 542-550
- Nam Nguyen, Rich Caruana:
Classification with partial labels. 551-559
- Dino Pedreschi, Salvatore Ruggieri, Franco Turini:
Discrimination-aware data mining. 560-568
- Ian Porteous, David Newman, Alexander Ihler, Arthur U. Asuncion, Padhraic Smyth, Max Welling:
Fast collapsed gibbs sampling for latent dirichlet allocation. 569-577
- Hiroto Saigo, Nicole Krämer, Koji Tsuda:
Partial least squares regression for graph mining. 578-586
- Issei Sato, Minoru Yoshida, Hiroshi Nakagawa:
Knowledge discovery of semantic relationships between words using nonparametric bayesian graph model. 587-595
- Mukund Seshadri, Sridhar Machiraju, Ashwin Sridharan, Jean Bolot, Christos Faloutsos, Jure Leskovec:
Mobile call graphs: beyond power-law and lognormal distributions. 596-604
- Qihong Shao, Yi Chen, Shu Tao, Xifeng Yan, Nikos Anerousis:
Efficient ticket routing by resolution sequence mining. 605-613
- Victor S. Sheng, Foster J. Provost, Panagiotis G. Ipeirotis:
Get another label? improving data quality and data mining using multiple, noisy labelers. 614-622
- Jin Shieh, Eamonn J. Keogh:
iSAX: indexing and mining terabyte sized time series. 623-631
- Ka Cheung Sia, Junghoo Cho, Yun Chi, Belle L. Tseng:
Efficient computation of personal aggregate queries on blogs. 632-640
- György J. Simon, Vipin Kumar, Zhi-Li Zhang:
Semi-supervised approach to rapid and reliable labeling of large data sets. 641-649
- Ajit Paul Singh, Geoffrey J. Gordon:
Relational learning via collective matrix factorization. 650-658
- Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums:
A bayesian mixture model with linear regression mixing proportions. 659-667
- Liang Sun, Shuiwang Ji, Jieping Ye:
Hypergraph spectral learning for multi-label classification. 668-676
- Lei Tang, Huan Liu, Jianping Zhang, Zohreh Nazeri:
Community evolution in dynamic multi-mode networks. 677-685
- Hanghang Tong, Spiros Papadimitriou, Jimeng Sun, Philip S. Yu, Christos Faloutsos:
Colibri: fast mining of large static and dynamic graphs. 686-694
- Pedro O. S. Vaz de Melo, Virgílio A. F. Almeida, Antonio Alfredo Ferreira Loureiro:
Can complex network metrics predict the behavior of NBA teams? 695-703
- Daniel David Walker, Eric K. Ringger:
Model-based document clustering with a collapsed gibbs sampler. 704-712
- Pu Wang, Carlotta Domeniconi:
Building semantic kernels for text classification using wikipedia. 713-721
- Michael L. Wick, Khashayar Rohanimanesh, Karl Schultz, Andrew McCallum:
A unified approach for schema matching, coreference and canonicalization. 722-730
- Fei Wu, Raphael Hoffmann, Daniel S. Weld:
Information extraction from Wikipedia: moving down the long tail. 731-739
- Junjie Wu, Hui Xiong, Jian Chen:
SAIL: summation-based incremental learning for information-theoretic clustering. 740-748
- Shan-Hung Wu, Keng-Pei Lin, Chung-Min Chen, Ming-Syan Chen:
Asymmetric support vector machines: low false-positive learning under the user tolerance. 749-757
- Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan:
Succinct summarization of transactional databases: an overlapped hyperrectangle scheme. 758-766
- Yabo Xu, Ke Wang, Ada Wai-Chee Fu, Philip S. Yu:
Anonymizing transaction databases for publication. 767-775
- Jian Yang, Ning Zhong, Yiyu Yao, Jue Wang:
Local peculiarity factor and its application in outlier detection. 776-784
- Luh Yen, Marco Saerens, Amin Mantrach, Masashi Shimbo:
A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances. 785-793
- Chun-Nam John Yu, Thorsten Joachims:
Training structural svms with kernels using sampled cuts. 794-802
- Lei Yu, Chris H. Q. Ding, Steven Loscalzo:
Stable feature selection via dense feature groups. 803-811
- Peng Zhang, Xingquan Zhu, Yong Shi:
Categorizing and mining concept drifting data streams. 812-820
- Xiang Zhang, Fei Zou, Wei Wang:
Fastanova: an efficient algorithm for genome-wide association study. 821-829
- Bin Zhao, Fei Wang, Changshui Zhang:
Cuts3vm: a fast semi-supervised svm algorithm. 830-838
- Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Yung Chang:
Identifying biologically relevant genes via multiple heterogeneous data sources. 839-847
- Wenjun Zhou, Hui Xiong:
Volatile correlation computation: a checkpoint view. 848-856
Industrial papers
- Shyam Boriah, Vipin Kumar, Michael S. Steinbach, Christopher Potter, Steven A. Klooster:
Land cover change detection: a case study. 857-865
- Mohamed Bouguessa, Benoît Dumoulin, Shengrui Wang:
Identifying authoritative actors in question-answering forums: the case of Yahoo! answers. 866-874
- Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen, Hang Li:
Context-aware query suggestion by mining click-through and session data. 875-883
- Christine H. Chih, Douglas Stott Parker Jr.:
The persuasive phase of visualization. 884-892
- Richard Chow, Philippe Golle, Jessica Staddon:
Detecting privacy leaks using corpus-based association rules. 893-901
- Ying Cui, Jennifer G. Dy, Gregory C. Sharp, Brian M. Alexander, Steve B. Jiang:
Learning methods for lung tumor markerless gating in image-guided radiotherapy. 902-910
- Shantanu Godbole, Shourya Roy:
Text classification, business intelligence, and interactivity: automating C-Sat analysis for services industry. 911-919
- Robert L. Grossman, Yunhong Gu:
Data mining using high performance data clouds: experimental studies using sector and sphere. 920-927
- Shen-Shyang Ho, Ashit Talukder:
Automated cyclone discovery and tracking using knowledge sharing in multiple heterogeneous satellite data. 928-936
- Noam Koenigstein, Yuval Shavitt, Tomer Tankel:
Spotting out emerging artists using geo-aware analysis of P2P query strings. 937-945
- Prem Melville, Saharon Rosset, Richard D. Lawrence:
Customer targeting models using actively-selected web content. 946-953
- Fabian Mörchen, Mathäus Dejori, Dmitriy Fradkin, Julien Etienne, Bernd Wachmann, Markus Bundschus:
Anticipating annotations and emerging trends in biomedical literature. 954-962
- G. Niklas Norén, Andrew Bate, Johan Hopstadius, Kristina Star, I. Ralph Edwards:
Temporal pattern discovery for trends and transient effects: its application to patient records. 963-971
- Nish Parikh, Neel Sundaresan:
Scalable and near real-time burst detection from eCommerce queries. 972-980
- Renuka Sindhgatta:
Identifying domain expertise of developers from source code. 981-989
- Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, Zhong Su:
ArnetMiner: extraction and mining of academic social networks. 990-998
- Leonardo Weiss Ferreira Chaves, Erik Buchmann, Klemens Böhm:
Tagmark: reliable estimations of RFID tags for business processes. 999-1007
- Gang Wu, Brendan Kitts:
Experimental comparison of scalable online ad serving. 1008-1015
- Xintian Yang, Sitaram Asur, Srinivasan Parthasarathy, Sameep Mehta:
A visual-analytic toolkit for dynamic interaction graphs. 1016-1024
- Jieping Ye, Kewei Chen, Teresa Wu, Jing Li, Zheng Zhao, Rinkal Patel, Min Bae, Ravi Janardan, Huan Liu, Gene E. Alexander, Eric Reiman:
Heterogeneous data fusion for alzheimer's disease study. 1025-1033
- Shipeng Yu, Glenn Fung, Rómer Rosales, Sriram Krishnan, R. Bharat Rao, Cary Dehing-Oberije, Philippe Lambin:
Privacy-preserving cox regression for survival analysis. 1034-1042
- Sai Zeng, Prem Melville, Christian A. Lang, Ioana M. Boier-Martin, Conrad Murphy:
Using predictive analysis to improve invoice-to-cash collection. 1043-1050
- Yi Zhang, Arun C. Surendran, John C. Platt, Mukund Narasimhan:
Learning from multi-topic web documents for contextual advertisement. 1051-1059
Panel
- Ravi Kumar, Alexander Tuzhilin, Christos Faloutsos, David D. Jensen, Gueorgi Kossinets, Jure Leskovec, Andrew Tomkins:
Social networks: looking ahead. 1060
Demonstrations
- Hendrik Blockeel, Toon Calders, Élisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet:
An inductive database prototype based on virtual mining views. 1061-1064
- Peter Christen:
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface. 1065-1068
- Luigi Di Caro, K. Selçuk Candan, Maria Luisa Sapino:
Using tagflake for condensing navigable tag hierarchies from tag clouds. 1069-1072
- Shantanu Godbole, Shourya Roy:
An integrated system for automatic customer satisfaction analysis in the services industry. 1073-1076
- Ming Hua, Jian Pei:
DiMaC: a disguised missing data cleaning tool. 1077-1080
- Evangelos E. Kotsifakos, Irene Ntoutsi, Yannis Vrahoritis, Yannis Theodoridis:
Pattern-Miner: integrated management and mining over data mining models. 1081-1084
- Hongyan Liu, Hui Yang, Wenbo Li, Wei Wei, Jun He, Xiaoyong Du:
CRO: a system for online review structurization. 1085-1088
- Emmanuel Müller, Ira Assent, Ralph Krieger, Timm Jansen, Thomas Seidl:
Morpheus: interactive exploration of subspace clustering. 1089-1092
- Hill Nguyen, Nish Parikh, Neel Sundaresan:
A software system for buzz-based recommendations. 1093-1096
- Shuyi Zheng, Matthew R. Scott, Ruihua Song, Ji-Rong Wen:
Pictor: an interactive system for importing data from a website. 1097-1100















