This document discusses using GraphLab Create together with Apache Spark for machine learning applications. It gives an overview of Spark, including how to create resilient distributed datasets (RDDs) and run parallel operations on them across a cluster. It then lists many of the machine learning algorithms available in GraphLab Create, including recommender systems, classification, regression, text analysis, image analysis, and graph analytics. The document proposes using notebooks to build data science products that deliver personalized experiences through machine learning and intelligent automation. Finally, it demonstrates clustering customer transactions from an expense reporting dataset to identify patterns in customer behavior.
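To make the clustering step concrete, here is a minimal sketch of k-means on a handful of expense transactions. It is written in plain Python and does not use the GraphLab Create or Spark APIs. The `kmeans` helper, the `transactions` values, and the two features (amount in dollars, receipts per report) are hypothetical and chosen only for illustration; the summarized document's dataset and method may differ.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Deterministic initialisation (an assumption made for reproducibility):
    # pick k points spread evenly through the input list.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
    return centroids, labels


# Hypothetical expense transactions: (amount in dollars, receipts per report).
transactions = [
    (12.0, 1), (15.0, 2), (11.0, 1),
    (480.0, 9), (510.0, 11), (495.0, 10),
]

centroids, labels = kmeans(transactions, k=2)
for point, label in zip(transactions, labels):
    print(label, point)
```

With this toy data the small, frequent expenses and the large, receipt-heavy reports fall into separate clusters, which is the kind of behavioral grouping the demonstration aims for. On real data the features would normally be scaled first so that the dollar amount does not dominate the distance.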