Transforming RDDs with the super useful flatMap() API
In this recipe, we examine the flatMap() method, which is often a source of confusion for beginners. On closer examination, however, it turns out to be a clear concept: it applies the lambda function to each element, just like map(), and then flattens the results into a single structure. Rather than a list of lists, we get a single list made up of the elements of every sublist.
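To make the difference concrete before setting up Spark, here is a minimal sketch that uses plain Scala collections as a stand-in for an RDD. This is an assumption made for illustration: the standard library's map and flatMap treat elements the same way, but nothing here runs on a cluster. The object name FlatMapSketch, the helper words, and the sample data are hypothetical and not part of the recipe.

```scala
// Plain Scala collections stand in for an RDD here (illustration only).
object FlatMapSketch {
  // Hypothetical sample data: each element is one line of text.
  val lines: List[String] = List("hello spark", "flat map")

  // Hypothetical helper: split a line into its words.
  def words(line: String): List[String] = line.split(" ").toList

  // map produces exactly one output per input, so the result is nested.
  val mapped: List[List[String]] = lines.map(words)

  // flatMap applies the same function, then concatenates the sublists.
  val flattened: List[String] = lines.flatMap(words)

  def main(args: Array[String]): Unit = {
    println(mapped)    // List(List(hello, spark), List(flat, map))
    println(flattened) // List(hello, spark, flat, map)
  }
}
```

Note that flattening the output of map gives the same result as calling flatMap directly, which is a handy way to remember what the method does.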
How to do it...
- Start a new project in IntelliJ or in an IDE of your choice. Make sure the necessary JAR files are included.
- Set up the package location where the program will reside:
package spark.ml.cookbook.chapter3
- Import the necessary packages:
import breeze.numerics.pow
import org.apache.spark.sql.SparkSession
import Array._
- Import the packages for setting up the logging level for log4j. This step is optional, but we highly recommend it (change the level appropriately as you move through the development cycle):
import org.apache.log4j.Logger
import org.apache...