Apache Spark is an open-source cluster computing framework designed for fast batch and streaming data processing. It addresses limitations of earlier distributed processing systems such as MapReduce by providing in-memory computing and supporting a more general programming model. Spark Core provides the basic functionality and serves as the foundation for the higher-level modules: Spark SQL, MLlib, GraphX, and Spark Streaming. Resilient Distributed Datasets (RDDs) are Spark's basic abstraction for distributed data: immutable collections, partitioned across the cluster, that can be operated on in parallel. Key benefits of Spark include speed from in-memory computing, ease of use through its APIs, and a unified engine that supports multiple workloads.
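
To make the RDD idea more concrete, here is a minimal toy model in plain Python. It is not Spark and does not use Spark's API: `ToyRDD` and this `parallelize` helper are hypothetical names invented for illustration, with method names chosen only to echo Spark's RDD operations. The sketch assumes a single machine and uses a thread pool as a stand-in for cluster workers. It shows the two properties described above: the data is split into immutable partitions, and each partition can be processed independently and in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class ToyRDD:
    """Hypothetical stand-in for an RDD: immutable partitions plus a recorded chain of operations."""

    def __init__(self, partitions, ops=()):
        # Tuples keep both the partitioned data and the operation chain immutable.
        self._partitions = tuple(tuple(p) for p in partitions)
        self._ops = tuple(ops)

    def map(self, fn):
        # A transformation returns a new ToyRDD; nothing is computed yet.
        return ToyRDD(self._partitions, self._ops + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._partitions, self._ops + (("filter", pred),))

    def _compute(self, partition):
        # Apply the recorded operations, in order, to a single partition.
        items = iter(partition)
        for kind, fn in self._ops:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    def _run(self):
        # Each partition is processed independently; threads stand in for workers.
        with ThreadPoolExecutor() as pool:
            return list(pool.map(self._compute, self._partitions))

    def collect(self):
        # An action triggers evaluation and gathers results in partition order.
        return [x for part in self._run() for x in part]

    def reduce(self, fn):
        # Reduce within each non-empty partition first, then combine the partial results.
        partials = [reduce(fn, part) for part in self._run() if part]
        return reduce(fn, partials)


def parallelize(data, num_partitions=2):
    """Split a local collection into contiguous chunks (toy helper, not Spark's)."""
    data = list(data)
    size = max(1, -(-len(data) // num_partitions))  # ceiling division
    return ToyRDD(data[i:i + size] for i in range(0, len(data), size))


numbers = parallelize(range(1, 11), num_partitions=3)
even_squares = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

result = even_squares.collect()                  # [4, 16, 36, 64, 100]
total = even_squares.reduce(lambda a, b: a + b)  # 220
```

Note that `filter` and `map` never modify `numbers`; each call produces a new object that shares the same underlying partitions. A real Spark RDD additionally tracks lineage for fault recovery and distributes partitions across machines, neither of which this sketch attempts.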