This edited book collects state-of-the-art research in large-scale data analytics carried out over the last few years. It is among the first books devoted to this important area, drawing on contributions from diverse scientific fields such as databases, data mining, supercomputing, hardware architecture, data visualization, statistics, and privacy.
There is an increasing need for new approaches and technologies that can analyze and synthesize very large amounts of data, on the order of petabytes, generated by massively distributed data sources. This requires new distributed architectures for data analysis. Additionally, the heterogeneity of such sources poses significant challenges for efficient analysis of the data under numerous constraints, including consistent data integration, data homogenization and scaling, and privacy and security preservation. The authors also broaden the reader's understanding of emerging real-world applications in domains such as customer behavior modeling, graph mining, telecommunications, cyber-security, and social network analysis, all of which impose additional requirements on large-scale data analysis.
Large-Scale Data Analytics is organized in eight chapters, each providing either a survey of an important direction in large-scale data analytics or individual results of emerging research in the field. The book presents key recent research that will help shape the future of large-scale data analytics, leading the way to the design of new approaches and technologies that can analyze and synthesize very large amounts of heterogeneous data. Students, researchers, professionals, and practitioners will find this book an authoritative and comprehensive resource.
Table of Contents
Chapter 1 The Family of Map-Reduce
Chapter 2 Optimization of Massively Parallel Data Flows
Chapter 3 Mining Tera-Scale Graphs with "Pegasus": Algorithms and Discoveries
Chapter 4 Customer Analyst for the Telecom Industry
Chapter 5 Machine Learning Algorithm Acceleration Using Hybrid (CPU-MPP) MapReduce Clusters
Chapter 6 Large-Scale Social Network Analysis
Chapter 7 Visual Analysis and Knowledge Discovery for Text
Chapter 8 Practical Distributed Privacy-Preserving Data Analysis at Large Scale