1. Understanding Clustering | Rapid - Apache Mahout Clustering designs

Rapid - Apache Mahout Clustering designs

By: Ashish Gupta

Overview of this book

As more organizations adopt big data analytics, interest in platforms that provide storage, computation, and analytic capabilities has grown. Apache Mahout addresses this need by making it practical to implement complex machine learning algorithms, so that you can analyse your data and draw useful insights from it. Starting with an introduction to clustering algorithms, this book gives an overview of Apache Mahout and the algorithms it uses for clustering data. It introduces algorithms such as K-Means, Fuzzy K-Means, and StreamingKMeans, and shows how to use Mahout to cluster your data with each of them. You will study the different types of clustering and learn how to use Apache Mahout with real-world data sets to implement and evaluate your clusters. The book also discusses cluster improvement and visualization using the Mahout APIs, and explores model-based clustering and topic modelling using the Dirichlet process. Finally, you will learn how to build and deploy a model for production use.

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

1. Understanding Clustering

The clustering concept

Understanding distance measures

Understanding different clustering techniques

Algorithm support in Mahout

Clustering algorithms in Mahout

Installing Mahout

Preparing data for use with clustering techniques

Summary

2. Understanding K-means Clustering

Learning K-means

Visualizing clusters

Summary

3. Understanding Canopy Clustering

Running Canopy clustering on Mahout

Visualizing clusters

Working with CSV files

Summary

4. Understanding the Fuzzy K-means Algorithm Using Mahout

Learning Fuzzy K-means clustering

Visualizing clusters

Summary

5. Understanding Model-based Clustering

Learning model-based clustering

Running LDA using Mahout

Summary

6. Understanding Streaming K-means

Learning Streaming K-means

Using Mahout for streaming K-means

Summary

7. Spectral Clustering

Understanding spectral clustering

Mahout implementation of spectral clustering

Summary

8. Improving Cluster Quality

Evaluating clusters

Using DistanceMeasure interface

Summary

9. Creating a Cluster Model for Production

Preparing the dataset

Launching the Mahout job on the cluster

Performance tuning for the job

Summary

Index
