Tim Hunter presented on TensorFrames, which allows users to run TensorFlow models on Apache Spark. Some key points:
- TensorFrames embeds TensorFlow computations into Spark's execution engine to enable distributed deep learning across a Spark cluster.
- It improves performance over alternatives such as Scala UDFs by avoiding serialization and copying memory directly between processes.
- The demo showed TensorFrames using GPUs, both on Databricks clusters and on a local machine, to accelerate numerical workloads such as kernel density estimation and Deep Dream image generation.
- Future work includes better integration with Tungsten and MLlib data types as well as official GPU support on Databricks clusters. TensorFrames aims to provide a simple API for
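
To give a sense of the kind of numerical workload the demo accelerated, here is a minimal sketch of Gaussian kernel density estimation in plain Python. It is not TensorFrames code and does not reproduce the talk's demo. The function name `gaussian_kde`, its parameters, and the sample values are assumptions made for illustration only.

```python
import math


def gaussian_kde(samples, points, bandwidth):
    """Estimate a 1-D density at each query point with a Gaussian kernel.

    Hypothetical helper for illustration; not part of TensorFrames.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    # Normalizing constant: 1 / (n * h * sqrt(2 * pi))
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    densities = []
    for p in points:
        total = 0.0
        for s in samples:
            u = (p - s) / bandwidth
            total += math.exp(-0.5 * u * u)
        densities.append(norm * total)
    return densities


# Example usage with made-up observations.
observations = [0.0, 1.0, 2.5]
estimates = gaussian_kde(observations, [0.0, 1.0, 2.0], bandwidth=0.5)
```

The nested loop does work proportional to the number of samples times the number of query points, and each query point is evaluated independently of the others. That independence is what makes this kind of computation a natural candidate for distribution across a cluster or for GPU acceleration.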