


BigData Conference 2013: Santa Clara, CA, USA
- Xiaohua Hu, Tsau Young Lin, Vijay V. Raghavan, Benjamin W. Wah, Ricardo Baeza-Yates, Geoffrey C. Fox, Cyrus Shahabi, Matthew Smith, Qiang Yang, Rayid Ghani, Wei Fan, Ronny Lempel, Raghunath Nambiar:
2013 IEEE International Conference on Big Data (IEEE BigData 2013), 6-9 October 2013, Santa Clara, CA, USA. IEEE Computer Society 2013, ISBN 978-1-4799-1292-6
Conference Paper Presentations
- Amgad Madkour, Walid G. Aref, Saleh M. Basalamah:
Knowledge cubes - A proposal for scalable and semantically-guided management of Big Data. 1-7
- Pascal Bianchi, Stéphan Clémençon, Gemma Morral, Jérémie Jakubowicz:
On-line learning gossip algorithm in multi-agent systems with local decision rules. 6-14
- Peter Sanders, Sebastian Schlag, Ingo Müller:
Communication efficient algorithms for fundamental big data problems. 15-23
- Upa Gupta, Leonidas Fegaras:
Map-based graph analysis on MapReduce. 24-30
- Tao Luo, Yin Liao, Guoliang Chen, Yunquan Zhang:
P-DOT: A model of computation for big data. 31-37
- En-Hui Yang, Xiang Yu:
Transparent composite model for large scale image/video processing. 38-44
- Rui Han, Lei Nie, Moustafa Ghanem, Yike Guo:
Elastic algorithms for guaranteeing quality monotonicity in big data mining. 45-50
- Mario Pastorelli, Antonio Barbuzzi, Damiano Carra, Matteo Dell'Amico, Pietro Michiardi:
HFSP: Size-based scheduling for Hadoop. 51-59
- Benedikt Elser, Alberto Montresor:
An evaluation study of BigData frameworks for graph processing. 60-67
- Bryan N. Lawrence, Victoria L. Bennett, J. Churchill, Martin Juckes, Philip Kershaw, Stephen Pascoe, Sam Pepler, M. Pritchard, Ag Stephens:
Storing and manipulating environmental big data with JASMIN. 68-75
- Hieu Hanh Le, Satoshi Hikida, Haruo Yokota:
Efficient gear-shifting for a power-proportional distributed data-placement method. 76-84
- Patrick Leyshock, David Maier, Kristin Tufte:
Agrios: A hybrid approach to big array analytics. 85-93
- Chun-Hsiang Lee, David Birch, Chao Wu, Dilshan Silva, Orestis Tsinalis, Yang Li, Shulin Yan, Moustafa Ghanem, Yike Guo:
Building a generic platform for big sensor data application. 94-102
- Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen:
Locality-driven high-level I/O aggregation for processing scientific datasets. 103-111
- Dheeraj Kumar, Marimuthu Palaniswami, Sutharshan Rajasegarar, Christopher Leckie, James C. Bezdek, Timothy C. Havens:
clusiVAT: A mixed visual/numerical clustering algorithm for big data. 112-117
- Toshimori Honjo, Kazuki Oikawa:
Hardware acceleration of Hadoop MapReduce. 118-124
- Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh:
Optimizing the MapReduce framework on Intel Xeon Phi coprocessor. 125-130
- Eugen Feller, Lavanya Ramakrishnan, Christine Morin:
On the performance and energy efficiency of Hadoop deployment models. 131-136
- D. Michael Freemon:
Optimizing throughput on guaranteed-bandwidth WAN networks for the Large Synoptic Survey Telescope (LSST). 137-142
- Takuya Araki, Kazuyo Narita, Hiroshi Tamano:
Feliss: Flexible distributed computing framework with light-weight checkpointing. 143-149
- Jonas Dias, Eduardo S. Ogasawara, Daniel de Oliveira, Fábio Porto, Patrick Valduriez, Marta Mattoso:
Algebraic dataflows for big data analysis. 150-155
- Wei Yan, Yuan Xue, Bradley A. Malin:
Scalable and robust key group size estimation for reducer load balancing in MapReduce. 156-162
- Chao Yin, Jianzong Wang, Changsheng Xie, Jiguang Wan, Changlin Long, Wenjuan Bi:
Robot: An efficient model for big data storage systems based on erasure coding. 163-168
- Chao Chen, Michael Lang, Yong Chen:
Multilevel Active Storage for big data applications in high performance computing. 169-174
- Chandima Hewa Nadungodage, Yuni Xia, Jaehwan John Lee, Myungcheol Lee, Choon Seo Park:
GPU accelerated item-based collaborative filtering for big-data applications. 175-180
- GuiXin Guo, Shuang Qiu, Zhiqiang Ye, Bingqiang Wang, Lin Fang, Mian Lu, Simon See, Rui Mao:
GPU-accelerated adaptive compression framework for genomics data. 181-186
- Deepal Jayasinghe, Josh Kimball, Tao Zhu, Siddharth Choudhary, Calton Pu:
An infrastructure for automating large-scale performance studies and data processing. 187-192
- Li-Yung Ho, Tsung-Han Li, Jan-Jan Wu, Pangfeng Liu:
Kylin: An efficient and scalable graph data processing system. 193-198
- Qunzhi Zhou, Yogesh Simmhan, Viktor K. Prasanna:
Towards hybrid online on-demand querying of realtime data with stateful complex event processing. 199-205
- Jiaran Zhang, Xiaohui Yu, Yang Liu, Liwei Lin:
DDSN: Duplicate detection to reduce both storage and bandwidth consumption. 206-211
- Aalap Tripathy, Ka Chon Ieong, Atish Patra, Rabi N. Mahapatra:
A reconfigurable computing architecture for semantic information filtering. 212-218
- Oyindamola O. Akande, Philip J. Rhodes:
Iteration aware prefetching for unstructured grids. 219-227
- Elad Yom-Tov, Mounia Lalmas, Ricardo Baeza-Yates, Georges Dupret, Janette Lehmann, Pinar Donmez:
Measuring inter-site engagement. 228-236
- Ting Chen, Kenjiro Taura:
A selective checkpointing mechanism for query plans in a parallel database system. 237-245
- Kyumars Sheykh Esmaili, Lluis Pamies-Juarez, Anwitaman Datta:
CORE: Cross-object redundancy for efficient data repair in storage systems. 246-254
- Nikolaos Papailiou, Ioannis Konstantinou, Dimitrios Tsoumakos, Panagiotis Karras, Nectarios Koziris:
H2RDF+: High-performance distributed joins over large-scale RDF graphs. 255-263
- Austin R. Benson, David F. Gleich, James Demmel:
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. 264-272
- Radu Tudoran, Alexandru Costan, Ramin Rezai Rad, Goetz Brasche, Gabriel Antoniu:
Adaptive file management for scientific workflows on the Azure cloud. 273-281
- Tian Guo, Thanasis G. Papaioannou, Karl Aberer:
Model-view sensor data management in the cloud. 282-290
- Anthony D. Fox, Christopher N. Eichelberger, James N. Hughes, Skylar Lyon:
Spatio-temporal indexing in non-relational distributed databases. 291-299
- Lefteris Sidirourgos, Martin L. Kersten, Peter Boncz:
Scientific discovery through weighted sampling. 300-306
- Stefan Pröll, Andreas Rauber:
Scalable data citation in dynamic, large databases: Model and reference implementation. 307-312
- Krish K. R., Aleksandr Khasymski, Guanying Wang, Ali Raza Butt, Gaurav Makkar:
On the use of shared storage in shared-nothing environments. 313-318
- Alexander Artikis, Matthias Weidlich, Avigdor Gal, Vana Kalogeraki, Dimitrios Gunopulos:
Self-adaptive event recognition for intelligent transport management. 319-325
- Leonardo Arturo Bautista-Gomez, Franck Cappello:
Improving floating point compression through binary masks. 326-331
- Junjie Chen, Philip C. Roth, Yong Chen:
Using pattern-models to guide SSD deployment for Big Data applications in HPC systems. 332-337
- Zhiquan Liu, Luo Luo, Wu-Jun Li:
Robust crowdsourced learning. 338-343
- Jialin Liu, Surendra Byna, Yong Chen:
Segmented analysis for reducing data movement. 344-349
- Simon Chan, Philip C. Treleaven, Licia Capra:
Continuous hyperparameter optimization for large-scale recommender systems. 350-358
- Hoang Vu Nguyen, Emmanuel Müller, Klemens Böhm:
4S: Scalable subspace search scheme overcoming traditional Apriori processing. 359-367
- Lars Arge, Michael T. Goodrich, Freek van Walderveen:
Computing betweenness centrality in external memory. 368-375
- Rong Gu, Furao Shen, Yihua Huang:
A parallel computing platform for training large scale neural networks. 376-384
- Raghvendra Mall, Rocco Langone, Johan A. K. Suykens:
Self-tuned kernel spectral clustering for large scale networks. 385-393
- Yuichiro Yasui, Katsuki Fujisawa, Kazushige Goto:
NUMA-optimized parallel breadth-first search on multicore single-node system. 394-402
- Arash Fard, M. Usman Nisar, Lakshmish Ramaswamy, John A. Miller, Matthew Saltz:
A distributed vertex-centric approach for pattern matching in massive graphs. 403-411
- Lee Parnell Thompson, Weijia Xu, Daniel P. Miranker:
Fast scalable selection algorithms for large scale data. 412-420
- Yoshiki Sakai, Kenji Yamanishi:
An NML-based model selection criterion for general relational data modeling. 421-429
- Rajiv Khanna, Liang Zhang, Deepak Agarwal, Bee-Chung Chen:
Parallel matrix factorization for binary response. 430-438
- Desheng Zhang, Tian He, Yunhuai Liu, John A. Stankovic:
CallCab: A unified recommendation system for carpooling and regular taxicab services. 439-447
- Abhirup Chakraborty:
Top-K aggregation over a large graph using shared-nothing systems. 448-457
- Nemanja Djuric, Mihajlo Grbovic, Slobodan Vucetic:
Distributed confidence-weighted classification on MapReduce. 458-466
- Zhiwei Yu, Raymond K. Wong, Chi-Hung Chi:
Scalable context-aware role mining with MapReduce. 467-474
- Yusheng Xie, Zhengzhang Chen, Kunpeng Zhang, Chen Jin, Yu Cheng, Ankit Agrawal, Alok N. Choudhary:
Elver: Recommending Facebook pages in cold start situation without content features. 475-479
- Paul Logasa Bogen, Christopher T. Symons, Amber McKenzie, Robert M. Patton, Robert E. Gillen:
Massively scalable near duplicate detection in streams of documents using MDSH. 480-486
- Ahmet Erdem Sariyüce, Kamer Kaya, Erik Saule, Ümit V. Çatalyürek:
Incremental algorithms for closeness centrality. 487-492
- Bo Zhang, Zhongzhi Shi:
Classification of big velocity data via cross-domain Canonical Correlation Analysis. 493-498
- Frank K. H. A. Dehne, Q. Kong, Andrew Rau-Chaplin, Hamidreza Zaboli, R. Zhou:
A distributed tree data structure for real-time OLAP on cloud architectures. 499-505
- Jiangling Yin, Andrew Foran, Jun Wang:
DL-MPI: Enabling data locality computation for MPI-based data-intensive applications. 506-511
- Chenxia Wu, Haiqin Yang, Jianke Zhu, Jiemi Zhang, Irwin King, Michael R. Lyu:
Sparse Poisson coding for high dimensional document clustering. 512-517
- Martin Weidner, Jonathan Dees, Peter Sanders:
Fast OLAP query execution in main memory on large data in a cluster. 518-524
- Xudong Zhang, Wayne Xin Zhao, Dongdong Shan, Hongfei Yan:
Group-Scheme: SIMD-based compression algorithms for web text data. 525-530
- Chun-Chieh Chen, Kuan-Wei Lee, Chih-Chieh Chang, De-Nian Yang, Ming-Syan Chen:
Efficient large graph pattern mining for big data in the cloud. 531-536
- Rui Wang, Kenneth Chiu:
A stream partitioning approach to processing large scale distributed graph datasets. 537-542
- Richard McCreadie, Craig Macdonald, Iadh Ounis, Miles Osborne, Sasa Petrovic:
Scalable distributed event detection for Twitter. 543-549
- Barbara Furletti, Lorenzo Gabrielli, Chiara Renso, Salvatore Rinzivillo:
Analysis of GSM calls data for understanding user mobility behavior. 550-555
- Haizhou Fu, HyeongSik Kim, Kemafor Anyanwu:
Scaling concurrency of personalized Semantic search over Large RDF data. 556-562
- Hui Miao, Xiangyang Liu, Bert Huang, Lise Getoor:
A hypergraph-partitioned vertex programming approach for large-scale consensus optimization. 563-568
- Simon Price, Peter A. Flach:
A Higher-order data flow model for heterogeneous Big Data. 569-574
- Daniel Trabold, Henrik Grosskreutz:
Parallel subgroup discovery on computing clusters - First results. 575-579
- Darakhshan J. Mir, Sibren Isaacman, Ramón Cáceres, Margaret Martonosi, Rebecca N. Wright:
DP-WHERE: Differentially private modeling of human mobility. 580-588
- Min-Sheng Lin, Chien-Yi Chiu, Yuh-Jye Lee, Hsing-Kuo Pao:
Malicious URL filtering - A big data application. 589-596
- Maryam Shoaran, Alex Thomo, Jens H. Weber-Jahnke:
Zero-knowledge private graph summarization. 597-605
- Lei Shi, Qi Liao, Xiaohua Sun, Yarui Chen, Chuang Lin:
Scalable network traffic visualization using compressed graphs. 606-612
- Duncan Hodges, Sadie Creese:
Breaking the Arc: Risk control for Big Data. 613-621
- Tim Hegeman, Bogdan Ghit, Mihai Capota, Jan Hidders, Dick H. J. Epema, Alexandru Iosup:
The BTWorld use case for big data analytics: Description, MapReduce logical workflow, and empirical evaluation. 622-630
- Bin Liu, Haifeng Chen, Abhishek B. Sharma, Guofei Jiang, Hui Xiong:
Modeling heterogeneous time series dynamics to profile big sensor data in complex physical systems. 631-638
- Wei Lu, Gang Chen, Anthony K. H. Tung, Feng Zhao:
Efficiently extracting frequent subgraphs using MapReduce. 639-647
- Diego Pennacchioli, Michele Coscia, Salvatore Rinzivillo, Dino Pedreschi, Fosca Giannotti:
Explaining the product range effect in purchase data. 648-656
- Natasha Balac, Tamara B. Sipes, Nicole Wolter, Kenneth Nunes, Robert S. Sinkovits, Homa Karimabadi:
Large Scale predictive analytics for real-time energy management. 657-664
- Geoffrey C. Fox, Deepak R. Mani, Saumyadipta Pyne:
Parallel deterministic annealing clustering and its application to LC-MS data analysis. 665-673
- Diana Moise, Denis Shestakov, Gylfi Thor Gudmunsson, Laurent Amsaleg:
Terabyte-scale image similarity search: Experience and best practice. 674-682
- Matthieu-P. Schapranow, Hasso Plattner:
HIG - An in-memory database platform enabling real-time analyses of genome data. 691-696
- András Garzó, András A. Benczúr, Csaba István Sidló, Daniel Tahara, Erik Francis Wyatt:
Real-time streaming mobility analytics. 697-702
- Andrew Rau-Chaplin, Blesson Varghese, Duane Wilson, Zhimin Yao, Norbert Zeh:
QuPARA: Query-driven large-scale portfolio aggregate risk analysis on MapReduce. 703-709
- Mauricio A. Hernández, Kirsten Hildrum, Prateek Jain, Rohit Wagle, Bogdan Alexe, Rajasekar Krishnamurthy, Ioana Roxana Stanoi, Chitra Venkatramani:
Constructing consumer profiles from social media data. 710-716
- Chien-Chih Chen, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, Jan-Ming Ho:
CloudRS: An error correction algorithm of high-throughput sequencing data based on scalable framework. 717-722
- Jungsuk Kwac, Ram Rajagopal:
Demand response targeting using big data analytics. 683-690
- Adrian Albert, Ram Rajagopal:
Building dynamic thermal profiles of energy consumption for individuals and neighborhoods. 723-728
- Peter Bajcsy, Antoine Vandecreme, Julien Amelot, Phuong Nguyen, Joe Chalfoun, Mary Brady:
Terabyte-sized image computations on Hadoop cluster platforms. 729-737
- Ron Begleiter, Yuval Elovici, Yona Hollander, Ori Mendelson, Lior Rokach, Roi Saltzman:
A fast and scalable method for threat detection in large-scale DNS logs. 738-741
- Matthew Hayes, Sam Shah:
Hourglass: A library for incremental processing on Hadoop. 742-752
- Qi Guo, Yan Li, Tao Liu, Kun Wang, Guancheng Chen, Xiaoming Bao, Wentao Tang:
Correlation-based performance analysis for full-system MapReduce optimization. 753-761
- Mihajlo Grbovic, Jon Malkin, Hirakendu Das:
Large scale ad latency analysis. 762-767
- Alessandro Morari, Vito Giovanni Castellana, David Haglin, John Feo, Jesse Weaver, Antonino Tumeo, Oreste Villa:
Accelerating semantic graph databases on commodity clusters. 768-772
- Peter Lubell-Doughtie, Jon Sondag:
Practical distributed classification using the Alternating Direction Method of Multipliers algorithm. 773-776
- Varun Sharma, Jeremy Carroll, Abhi Khune:
Scaling deep social feeds at Pinterest. 777-783
- Thibaud Chardonnens, Philippe Cudré-Mauroux, Martin Grund, Benoit Perroud:
Big data analytics on high Velocity streams: A case study. 784-787
Workshop 1: Distributed Storage Systems and Coding for Big Data
- Iryna Andriyanova, Alan Jule, Emina Soljanin:
The Code rebalancing problem for a storage-flexible Data Center Network. 1-6
- Wasim Ahmad Bhat, S. M. K. Quadri:
suvfs: A virtual file system in userspace that supports large files. 7-11
- Antonio Campello, Vinay A. Vaishampayan:
Reliability of erasure coded storage systems: A geometric approach. 12-16
- Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu, Chao Tian, Vinay A. Vaishampayan:
Distributed storage evaluation on a three-wide inter-data center deployment. 17-22
- Vinay Deolalikar:
Paired-replicas with constant repair time: Loss functions and memorylessness. 23-27
- Kyumars Sheykh Esmaili, Aatish Chiniah, Anwitaman Datta:
Efficient updates in cross-object erasure-coded storage systems. 28-32
- Hanxu Hou, Kenneth W. Shum, Hui Li:
Construction of exact-BASIC codes for distributed storage systems at the MSR point. 33-38
- Xianxia Huang, Hui Li, Tai Zhou, Yumeng Zhang, Han Guo, Hanxu Hou, Huayu Zhang, Kai Lei:
Minimum storage BASIC codes: A system perspective. 39-43
- Youngjae Kim, Scott Atchley, Geoffroy Vallée, Galen M. Shipman:
Layout-aware I/O Scheduling for terabits data movement. 44-51
Workshop 2: Big Data and the Humanities
- Alberto Acerbi, Vasileios Lampos, R. Alexander Bentley:
Robustness of emotion extraction from 20th century English books. 1-8
- Neal Audenaert, Natalie M. Houston:
VisualPage: Towards large scale analysis of nineteenth-century print culture. 9-16
- Tobias Blanke, Michael Bryant, Mark Hedges:
Back to our data - Experiments with NoSQL technologies in the Humanities. 17-20
- Sheryl Grant, Richard Marciano, Priscilla Ndiaye, Kristan E. Shawgo, Jefferson R. Heard:
The human face of crowdsourcing: A citizen-led crowdsourcing case study. 21-24
- Kathleen Kerr, Bernice L. Hausman, Samah Gad, Waqas Javen:
Visualization and rhetoric: Key concerns for utilizing big data in humanities research: A case study of vaccination discourses: 1918-1919. 25-32
- Amalia S. Levi:
Humanities 'big data': Myths, challenges, and lessons. 33-36
- Ben Miller, Ayush Shrestha, Jason Derby, Jennifer Olive, Karthikeyan Umapathy, Fuxin Li, Yanjun Zhao:
Digging into human rights violations: Data modelling and collective memory. 37-45
- Vu Dung Nguyen, Blesson Varghese, Adam Barker:
The royal birth of 2013: Analysing and visualising public sentiment in the UK using Twitter. 46-54
- Andrew Prescott:
Bibliographic records as humanities big data. 55-58
- C. J. Rupp, Paul Rayson, Alistair Baron, Christopher Donaldson, Ian N. Gregory, Andrew Hardie, Patricia Murrieta-Flores:
Customising geoparsing and georeferencing for historical texts. 59-62
- Jedrzej Rybicki, Benedikt von St. Vieth, Daniel Mallmann:
A concept of Generic Workspace for Big Data Processing in Humanities. 63-70
- W. Brent Seales, Steve Crossan, Mark Yoshitake, Sertan Girgin:
From assets to stories via the Google Cultural Institute Platform. 71-76
- Susan Brown, John Simpson:
The curious identity of Michael Field and its implications for humanities research with the semantic web. 77-85
- David A. Smith, Ryan Cordell, Elizabeth Maddock Dillon:
Infectious texts: Modeling text reuse in nineteenth-century newspapers. 86-94
- Ted Underwood, Michael L. Black, Loretta Auvil, Boris Capitanu:
Mapping mutable genres in structurally complex volumes. 95-103
- Lu Xiao, Yan Luo, Steven High:
CKM: A shared visual analytical tool for large-scale analysis of audio-video interviews. 104-112
- Weijia Xu, Maria Esteva, Jessica Trelogan, Todd Swinson:
A case study on entity Resolution for Distant Processing of big Humanities data. 113-120
Workshop 3: Workshop on Big Data and Society
- Vinay Deolalikar:
Enterprise pre-sales forums: A preliminary study of metadata and content. 1-4
- Roman Ferrando-Llopis, David López-Berzosa, Catherine Mulligan:
Advancing value creation and value capture in data-intensive contexts. 5-9
- Wen-Chiao Hsu, Jyun-Yao Huang, Chi-Hao Chen, Chien-Yu Su, Hsiao-Chen Shih, Tzu-Ya Liao, I-En Liao:
A cloud service for the evaluation of company's financial health using XBRL-based financial statements. 10-14
- Janez Kranjc, Vid Podpecan, Nada Lavrac:
Real-time data analysis in ClowdFlows. 15-22
- Udo Kroon:
Ma3tch: Privacy and knowledge: 'Dynamic networked collective intelligence'. 23-31
- F. Canari Pembe Muhtaroglu, Seniz Demir, Murat Obali, Canan Girgin:
Business model canvas perspective on big data applications. 32-37
- Pantelis Koutroumpis, Aija Leiponen:
Understanding the value of (big) data. 38-42
- Slobodanka Dana Kathrin Tomic, Anna Fensel:
OpenFridge: A platform for data economy for energy efficiency data. 43-47
- Wen Zhou, Shutao Ye, Xiaolong Lu:
A study of innovation network database Construction by using big data and an enterprise strategy model. 48-52
- Chao Wu, Yike Guo:
Enhanced user data privacy with pay-by-data model. 53-57
- Helen X. Xiang:
Query optimization over a heterogeneously distributed scientific database. 58-64
- Wuheng Luo:
Enterprise data economy: A hadoop-driven model and strategy. 65-70
Workshop 4: The First Workshop on Benchmarks, Performance Optimization, and Emerging hardware of Big Data Systems and Applications (BPOE 2013)
- Wei-Chun Chung, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee, Jan-Ming Ho:
Optimizing a MapReduce module of preprocessing high-throughput DNA sequencing data. 1-6
- Tyler Clemons, S. M. Faisal, Shirish Tatikonda, Charu C. Aggarwal, Srinivasan Parthasarathy:
Hash in a flash: Hash tables for flash devices. 7-14
- Martin Dimitrov, Karthik Kumar, Patrick Lu, Vish Viswanathan, Thomas Willhalm:
Memory system characterization of big data workloads. 15-22
- Yaakoub El Khamra, Niall Gaffney, David Walling, Eric A. Wernert, Weijia Xu, Hui Zhang:
Performance evaluation of R with Intel Xeon Phi coprocessor. 23-30
- Jing Quan, Yingjie Shi, Ming Zhao, Wei Yang:
The implications from benchmarking three big data systems. 31-38
- Taoying Liu, Jing Liu, Hong Liu, Wei Li:
A performance evaluation of Hive for scientific data management. 39-46
- Shengyuan Liu, Jungang Xu, Zongzhen Liu, Xu Liu:
Evaluating task scheduling in hadoop-based cloud systems. 47-53
- Xi Luo, Walid A. Najjar, Vagelis Hristidis:
Efficient near-duplicate document detection using FPGAs. 54-61
- Stephan Müller, Lars Butzmann, Stefan Klauck, Hasso Plattner:
Workload-aware aggregate maintenance in columnar in-memory databases. 62-69
- Fengfeng Ning, Chuliang Weng, Yuan Luo:
Virtualization I/O optimization based on shared memory. 70-77
- Pengfei Chen, Yong Qi, Xinyi Li, Li Su:
An ensemble MIC-based approach for performance diagnosis in big data platform. 78-85
- Shinichi Yamagiwa, Hiroshi Sakamoto:
A reconfigurable stream compression hardware based on static symbol-lookup table. 86-93
- Dong Yang, Xiang Zhong, Dong Yan, Fangqin Dai, Xusen Yin, Cheng Lian, Zhongliang Zhu, Weihua Jiang, Gansha Wu:
NativeTask: A Hadoop compatible framework for high performance. 94-101
- Tao Zhong, Kshitij A. Doshi, Xi Tang, Ting Lou, Zhongyan Lu, Hong Li:
On mixing high-speed updates and in-memory queries: A big-data architecture for real-time analytics. 102-109
- Runlin Zhou, Yingjie Shi, Chunge Zhu:
AxPUE: Application level metrics for power usage effectiveness in data centers. 110-117
- Wen Xiong, Zhibin Yu, Zhendong Bei, Juanjuan Zhao, Fan Zhang, Yubin Zou, Xue Bai, Ye Li, Cheng-Zhong Xu:
A characterization of big data benchmarks. 118-125
Workshop 5: The First Workshop on Big Data Visualization
- Leilani Battle, Michael Stonebraker, Remco Chang:
Dynamic reduction of query result sets for interactive visualizaton. 1-8
- Joseph A. Cottam, Andrew Lumsdaine, Peter Wang:
Overplotting: Unified solutions under Abstract Rendering. 9-16
- Alex Endert, Russ Burtner, Nick Cramer, Ralph Perko, Shawn D. Hampton, Kristin A. Cook:
Typograph: Multiscale spatial exploration of text documents. 17-24
- Jean-Francois Im, Felix Giguere Villegas, Michael J. McGuffin:
VisReduce: Fast and responsive incremental information visualization of large datasets. 25-32
- Peter Kristof, Bedrich Benes, Carol X. Song, Lan Zhao:
A system for large-scale visualization of streaming Doppler data. 33-40
- Milos Krstajic, Daniel A. Keim:
Visualization of streaming data: Observing change and context in information visualization techniques. 41-47
- Xiaotong Liu, Yifan Hu, Stephen C. North, Han-Wei Shen:
CompactMap: A mental map preserving visual interface for streaming text data. 48-55
- Chris Muelder, Tarik Crnovrsanin, Arnaud Sallaberry, Kwan-Liu Ma:
Egocentric storylines for visual analysis of large dynamic graphs. 56-62
- Eric Papenhausen, Bing Wang, Sungsoo Ha, Alla Zelenyuk, Dan Imre, Klaus Mueller:
GPU-accelerated incremental correlation clustering of large data with visual feedback. 63-70
- Florian Reichl, Marc Treib, Rüdiger Westermann:
Visualization of big SPH simulations via compressed octree grids. 71-78
- Zhangye Wang, Chang Chen, Juanxia Zhou, Jiyuan Liao, Wei Chen, Ross Maciejewski:
A novel visual analytics approach for clustering large-scale social data. 79-86
- Frederik Wiehr, Vidya Setlur, Alark Joshi:
DriveSense: Contextual handling of large-scale route map data for the automobile. 87-94
Workshop 6: Big Data and Science: Infrastructure and Services
- Sandro Fiore, Cosimo Palazzo, Alessandro D'Anca, Ian T. Foster, Dean N. Williams, Giovanni Aloisio:
A big data analytics framework for scientific data management. 1-8
- Eloy Gonzales, Bun Theang Ong, Koji Zettsu:
Searching inter-disciplinary scientific big data based on latent correlation analysis. 9-12
- Kulsawasd Jitkajornwanich, Upa Gupta, Sakthi Kumaran Shanmuganathan, Ramez Elmasri, Leonidas Fegaras, John McEnery:
Complete storm identification algorithms from big raw rainfall data using MapReduce framework. 13-20
- Wei Tang, Jared Wilkening, Narayan Desai, Wolfgang Gerlach, Andreas Wilke, Folker Meyer:
A scalable data analysis platform for metagenomics. 21-26
- Karan Vahi, Mats Rynge, Gideon Juve, Rajiv Mayani, Ewa Deelman:
Rethinking data management for big data scientific workflows. 27-35
- Pengfei Xuan, Yueli Zheng, Sapna Sarupria, Amy W. Apon:
SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop. 36-44
Workshop 7: Scalable Machine Learning: Theory and Applications
- Mohammadreza Babaee, Mihai Datcu, Gerhard Rigoll:
Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization. 1-6
- Bonny Banerjee, Jayanta K. Dutta:
Hierarchical feature learning from sensorial data by spherical clustering. 7-13
- Bonny Banerjee, Jayanta K. Dutta:
Efficient learning from explanation of prediction errors in streaming data. 14-20
- Karl Branting:
Distributed Pivot Clustering with arbitrary distance functions. 21-27
- Søren Dahlgaard, Christian Igel, Mikkel Thorup:
Nearest neighbor classification using bottom-k sketches. 28-34
- Ciro Donalek, S. George Djorgovski, Ashish Mahabal, Matthew J. Graham, Andrew J. Drake, Arun Kumar A., N. Sajeeth Philip, Thomas J. Fuchs, Michael J. Turmon, Michael Ting-Chang Yang, Giuseppe Longo:
Feature selection strategies for classifying high dimensional astronomical data sets. 35-41
- Majed Farrash, Wenjia Wang:
How data partitioning strategies and subset size influence the performance of an ensemble? 42-49
- William Gu, Jaesik Choi, Ming Gu, Horst D. Simon, Kesheng Wu:
Fast Change Point Detection for electricity market analysis. 50-57
- Hong Gu, Junzhe Cao:
A novel integrated method for human multiplex protein subcellular localization prediction. 58-62
- Hisao Ishibuchi, Masakazu Yamane, Yusuke Nojima:
Learning from multiple data sets with different missing attributes and privacy policies: Parallel distributed fuzzy genetics-based machine learning approach. 63-70
- Jiaoyan Chen, Huajun Chen, Xi Chen, Guozhou Zheng, Zhaohui Wu:
Data chaos: An entropy based MapReduce framework for scalable learning. 71-78
- Anthony Kleerekoper, Mikel Luján, Gavin Brown:
Exploring sketches for probability estimation with sublinear memory. 79-86
- Koji Kumanami, Kazuhiro Seki, Kuniaki Uehara:
Agglomerative co-clustering for synonymous phrases based on common effects and influences. 87-94
- Zhiyuan Lin, Duen Horng (Polo) Chau, U Kang:
Leveraging memory mapping for fast and scalable graph computation on a PC. 95-98
- Bingwei Liu, Erik Blasch, Yu Chen, Dan Shen, Genshe Chen:
Scalable sentiment classification for Big Data analysis using Naïve Bayes Classifier. 99-104
- Xuan Liu, Xiaoguang Wang, Stan Matwin, Nathalie Japkowicz:
Meta-learning for large scale machine learning with MapReduce. 105-110
- Sandy Moens, Emin Aksehirli, Bart Goethals:
Frequent Itemset Mining for Big Data. 111-118
- Haoruo Peng, Ding Liang, Cyrus Choi:
Evaluating parallel logistic regression models. 119-126
- Mahmudur Rahman, Mohammad Al Hasan:
Approximate triangle counting algorithms on multi-cores. 127-133
- Anton Slutsky, Xiaohua Hu, Yuan An:
Tree Labeled LDA: A Hierarchical model for web summaries. 134-140
- Kristoffer Stensbo-Smidt, Christian Igel, Andrew Zirm, Kim Steenstrup Pedersen:
Nearest neighbour regression outperforms model-based prediction of specific star formation rate. 141-144
- Naveen C. Tewari, Hari M. Koduvely, Sarbendu Guha, Arun Yadav, Gladbin David:
MapReduce implementation of Variational Bayesian Probabilistic Matrix Factorization algorithm. 145-152
- Xusen Yin, Bin Wu, Xiuqin Lin:
A unified framework for predicting attributes and links in social networks. 153-160
- Zijian Zhang, Timothy C. Havens:
Scalable approximation of kernel fuzzy c-means. 161-168
- Yun Zhu, Yanqing Zhang, Yi Pan:
Large-scale restricted boltzmann machines on single GPU. 169-174
Workshop 8: Big Data in Bioinformatics and Health Informatics
- Ankit Agrawal, Reda Al-Bahrani, Mark J. Russo, Jaishankar Raman, Alok N. Choudhary:
Lung transplant outcome prediction using UNOS data. 1-8
- Reda Al-Bahrani, Ankit Agrawal, Alok N. Choudhary:
Colon cancer survival prediction using ensemble data mining on SEER data. 9-16
- Raghunath Nambiar, Ruchie Bhardwaj, Adhiraaj Sethi, Rajesh Vargheese:
A look at challenges and opportunities of Big Data analytics in healthcare. 17-22
- Mario A. Bochicchio, Antonella Longo, Lucia Vaira, Antonio Malvasi, Andrea Tinelli:
Multidimensional analysis of fetal growth curves. 23-28
- Xi Chen, Huajun Chen, Ningyu Zhang, Jiaoyan Chen, Zhaohui Wu:
OWL reasoning over big biomedical data. 29-36
- Aaron Smalter Hall, Jun Huan:
KUChemBio: A repository of computational chemical biology data sets. 37-42
- Shinya Hayashi, Kenjiro Taura:
Parallel and memory-efficient Burrows-Wheeler transform. 43-50
- Meeyoung Park, Hariprasad Sampathkumar, Bo Luo, Xue-wen Chen:
Content-based assessment of the credibility of online healthcare information. 51-58
- Christian Seebode, Matthias Ort, Christian R. A. Regenbrecht, Martin Peuker:
BIG DATA infrastructures for pharmaceutical research. 59-63
- Kiyana Zolfaghar, Naren Meadem, Ankur Teredesai, Senjuti Basu Roy, Si-Chi Chin, Brian Muckian:
Big data solutions for predicting risk-of-readmission for congestive heart failure patients. 64-71
Workshop 9: Scholarly Big Data: Challenges & Issues
- Martine De Cock, Senjuti Basu Roy, Swapna Savvana, Vani Mandava, Brian Dalessandro, Claudia Perlich, William Cukierski, Benjamin Hamner:
The Microsoft Academic Search challenges at KDD Cup 2013. 1-4
- Philipp Mayr, Peter Mutschke:
Bibliometric-enhanced retrieval models for big scholarly information systems. 5-8
- Michael E. Payne, Linh Bao Ngo, Amy W. Apon:
Academic publishing as a social media paradigm. 9-12
- (Withdrawn) Big spatial data mining. 13-21
Workshop 10: Scalable Cloud Data Management
- Karamjit Kaur, Rinkle Rani:
Modeling and querying data in NoSQL databases. 1-7
- Lipyeow Lim:
Elastic data partitioning for cloud-based SQL processing systems. 8-16
- Jiamin Lu, Ralf Hartmut Güting:
Parallel SECONDO: Practical and efficient mobility data processing in the cloud. 17-25
- Mahsa Mofidpoor, Nematollaah Shiri, Thiruvengadam Radhakrishnan:
Index-based join operations in Hive. 26-33
- Katerina Stamou, Verena Kantere, Jean-Henry Morin:
SLA data management criteria. 34-42
Workshop 11: Big Data and Smarter Cities
- Harish S. Bhat

, Garnet Jason Vaz, Juan C. Meza
:
Fast solution of load shedding problems via a sequence of linear programs. 1-6 - Hongfei Li, Buyue Qian, Dhaivat Parikh, Arun Hampapur:

Alarm prediction in large-scale sensor networks - A case study in railroad. 7-14 - Alice Marascu, Pascal Pompey, Eric Bouillet, Olivier Verscheure, Michael Wurst, Martin Grund, Philippe Cudré-Mauroux:

MiSTRAL: An architecture for low-latency analytics on MasSive time series. 15-21 - Timothy H. Savage, Huy T. Vo:

Yellow cabs as red corpuscles. 22-28 - Yogesh Simmhan, Muhammad Usman Noor:

Scalable prediction of energy consumption using incremental time series clustering. 29-36 - M. Anil Yazici, Camille Kamga, Abhishek Singhal:

A big data driven model for taxi drivers' airport pick-up decisions in New York City. 37-44
Workshop 12: Knowledge management and Big Data Analytics
- Ruiwen Chen:

Managing massive graphs in relational DBMS. 1-8 - Benoît Denis, Amine Ghrab, Sabri Skhiri:

A distributed approach for graph-oriented multidimensional analysis. 9-16 - Yucong Duan, Yongzhi Wang, Jinpeng Wei, Ajay Kattepur, Wencai Du:

Constructing E-Tourism platform based on service value broker: A knowledge management perspective. 17-24 - Zhenwen Wang, Weidong Xiao, Bin Ge, Hao Xu:

ADraw: A novel social network visualization tool with attribute-based layout and coloring. 25-32 - Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du:

IntegrityMR: Integrity assurance framework for big data analytics and management applications. 33-40 - Helen X. Xiang:

Local join optimization over a heterogeneously distributed scientific database. 41-45 - Hao Xu, Weidong Xiao, Daquan Tang, Jiuyang Tang, Zhenwen Wang:

Core-based community evolution in mobile social networks. 46-51 - Xinran Yu, Turgay Korkmaz:

Super-sequence frequent pattern mining on sequential dataset. 52-59 - Yun Wei Zhao, Willem-Jan van den Heuvel, Xiaojun Ye:

Exploring big data in small forms: A multi-layered knowledge extraction of social networks. 60-67 - Xiang Zhao, Bin Ge, Jiuyang Tang, Weidong Xiao, Haichuan Shang:

Provenance comparison for large-scale knowledge discovery. 68-75
Posters
- Peter Bajcsy, Antoine Vandecreme, Mary Brady:

Re-projection of terabyte-sized images. 1 - Daniel Cheng, Peter Schretlen, Nathan Kronenfeld, Neil Bozowsky, William Wright:

Tile based visual analytics for Twitter big data exploratory analysis. 2-4 - HyeongSik Kim, Kemafor Anyanwu:

Optimizing queries over semantically integrated datasets on MapReduce platforms. 5-6 - Hye-Chung Kum, Ashok Kumar Krishnamurthy, Darshana Pathak, Michael K. Reiter, Stanley C. Ahalt:

Secure Decoupled Linkage (SDLink) system for building a social genome. 7-11 - Lin Li, Saeed Bagheri, Helena Goote, Asif Hasan, Gregg Hazard:

Risk adjustment of patient expenditures: A big data analytics approach. 12-14 - Yunlong Ma, Peng Zhang, Yanan Cao, Li Guo:

Parallel auto-encoder for efficient outlier detection. 15-17 - Teng-Sheng Moh, SivaNaga Prasad Shola:

New factors for identifying influential bloggers. 18-27 - Masaharu Munetomo, Shintaro Bando:

A scalable infrastructure of interactive evolutionary computation to evolve services online with data. 28 - Anmol Rajpurohit:

Big data for business managers - Bridging the gap between potential and value. 29-31 - Shusaku Tsumoto, Shoji Hirano, Haruko Iwata:

Granularity-based temporal data mining in hospital information system. 32-40 - Mengmeng Yang, Yi Zhou, Qu Zhou, Kai Chen, Jianhua He, Xiaokang Yang:

Observation of Matthew Effects in Sina Weibo microblogger. 41-43 - Jin Soung Yoo, Douglas Boulware:

A framework of spatial co-location mining on MapReduce. 44 - Wenrong Zeng, Yuhao Yang, Bo Luo:

Access control for big data using data content. 45-47
