This document discusses Apache Spark, an open-source cluster computing framework. Spark processes data in memory to reduce disk I/O and is optimized for speed, though it can operate on disk as well as in memory. It supports streaming data and machine learning algorithms, integrates DataFrames and graph processing, and can use Hadoop for resource management. Major companies such as IBM, Cloudera, and eBay use Spark for applications including recommendations, business intelligence, and data analytics.