© 2017 MapR TechnologiesMapR Confidential 1
Predictive Maintenance Using
Recurrent Neural Networks
Justin Brandenburg
Data Scientist, MapR Data Technologies
Industry 4.0 is Now – Internet of Things (IoT)
• $907B/year investment until 2020¹
• 1.6M operational industrial robots in the world in 2015²
• 2.6M by 2020¹
Industry 4.0 systems¹:
1. Interoperable
2. Information transparency
3. Technical assistance
4. Decentralized decision making
1: What Everyone Must Know About Industry 4.0, Forbes, June 2016
2: International Federation of Robotics (IFR) study, World Robotics 2016
Source: PwC 2016 Global Industry 4.0 Survey
Predictive Maintenance
• Think about predictive maintenance by tying it to a revenue stream:
when your machines are up and running, you’re making money.
• The idea behind predictive maintenance is that the failure patterns of various
types of equipment are predictable. If we can accurately predict when a piece
of hardware will fail, and replace that component before it fails, we
can achieve much higher levels of operational efficiency.
• With many devices now including sensors and other components that
send diagnostic reports, predictive maintenance using big data becomes
increasingly accurate and effective.
Data is Generated One Event at a Time
{ "time": "6:01.103", "event": "RETWEET",
  "location": { "lat": 40.712784, "lon": -74.005941 } }
{ "time": "5:04.120", "severity": "CRITICAL", "msg": "Service down" }
{ "card_num": 1234, "merchant": "Apple", "amount": 50 }
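A minimal sketch of consuming such events one at a time. It assumes each event arrives as one JSON object per line; the slide shows only fragments, so that framing and the `handle_event` helper are illustrative assumptions rather than anything from the talk.

```python
import json

# Three events mirroring the slide's fragments; one JSON object per line is an assumption.
RAW_EVENTS = [
    '{"time": "6:01.103", "event": "RETWEET", "location": {"lat": 40.712784, "lon": -74.005941}}',
    '{"time": "5:04.120", "severity": "CRITICAL", "msg": "Service down"}',
    '{"card_num": 1234, "merchant": "Apple", "amount": 50}',
]


def handle_event(raw: str) -> str:
    """Parse a single event and return a short label describing its kind (hypothetical helper)."""
    event = json.loads(raw)
    if "severity" in event:
        return "log:" + event["severity"]
    if "card_num" in event:
        return "payment:" + event["merchant"]
    return "social:" + event.get("event", "unknown")


# Events are consumed individually, in arrival order, with no batching.
labels = [handle_event(raw) for raw in RAW_EVENTS]
```

Each event is self-describing, so the consumer can decide what to do with it without waiting for a batch to fill.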
IoT Real Time Monitoring
[Diagram: Welding Sensors, Press Sensors, Prod Line, and Paint each have their own Stream; each Stream feeds a Topic on the Data Platform]
From Monitoring to Classifying to Predicting
• Monitoring still requires active involvement and quick response
– Oil well that is indicating increased temperature or volume
– Network traffic for botnet activity or insider threats
• What are we monitoring?
– Is this behavior normal?
• Based on what we are monitoring, can we predict what will
happen?
Classifying vs Predicting in IoT
• Classification – Is this behavior normal or non-normal?
• Prediction – Can we predict when this non-normal behavior will
occur?
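The difference between the two questions can be sketched in a few lines. This is an illustration under stated assumptions, not the talk's method: classification is reduced to a fixed threshold test, and prediction to a naive linear trend over recent readings. The function names and the temperature values are hypothetical.

```python
from typing import Optional, Sequence


def is_non_normal(reading: float, threshold: float) -> bool:
    """Classification: label the current reading."""
    return reading > threshold


def steps_until_non_normal(readings: Sequence[float], threshold: float) -> Optional[int]:
    """Prediction: estimate how many steps remain until the threshold is crossed.

    Uses the average change per step over the window (a naive linear trend).
    Returns 0 if already over the threshold, None if the trend is flat or falling.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings to estimate a trend")
    if readings[-1] > threshold:
        return 0
    slope = (readings[-1] - readings[0]) / (len(readings) - 1)
    if slope <= 0:
        return None
    # Smallest whole number of steps n with last + n * slope > threshold.
    gap = threshold - readings[-1]
    return int(gap // slope) + 1


temps = [70.0, 72.0, 74.0, 76.0]           # hypothetical sensor readings
currently_bad = is_non_normal(temps[-1], threshold=81.0)
steps_left = steps_until_non_normal(temps, threshold=81.0)
```

Classification answers a question about now; prediction answers a question about when, which is what makes it useful for scheduling maintenance.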
Predictive Maintenance for Industrial Devices
Primary goal: Reduce unplanned downtime
Need for Scale: Deploy to a Real Factory
Tesla Factory photo by Paul Sakuma/AP
Solution Building Blocks
MapR Converged Data Platform
Files, Tables, Streams together on same platform
• Event Data Streams
• Analytics & Machine Learning Engines
• Operational Database
• Cloud-scale Data Store
Converge-X™ Data Fabric
Shared Services: High Availability, Real Time, Security & Governance, Multi-tenancy, Disaster Recovery, Global Namespace
On-Premise, In the Cloud, Hybrid
Batch and Stream Processing on a Single Platform
[Diagram: sources (Batch Feeds, Logistics Databases, Device Communication); MapR-ES Topics; Batch Processing (Extract Features, Build Models) for AI & Machine Learning; Stream Processing (Derive Features, Apply Models) to Operationalize Models; Device Health; storage in MapR-XD and MapR-DB]
TensorFlow
TensorFlow is an open source software library for numerical
computation using data flow graphs
• Developed by Google and released to the open source community in Nov 2015, it
quickly became one of the most popular deep learning frameworks
• Two months after its release it had already become the most-forked
ML repository on GitHub
• Built on C++ with a Python interface
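The "data flow graph" idea can be illustrated without TensorFlow itself. The toy `Node` class below is a hypothetical stand-in written for this sketch, not TensorFlow's API: it only shows that a graph is described first and evaluated later, when values are fed in.

```python
from typing import Callable, Dict, Optional, Tuple


class Node:
    """A graph node: either a named input or an operation over other nodes."""

    def __init__(self, name: str, op: Optional[Callable[..., float]] = None,
                 inputs: Tuple["Node", ...] = ()):
        self.name = name
        self.op = op
        self.inputs = inputs

    def evaluate(self, feed: Dict[str, float]) -> float:
        # Input nodes read their value from the feed; operation nodes
        # evaluate their inputs first and then apply their function.
        if self.op is None:
            return feed[self.name]
        return self.op(*(node.evaluate(feed) for node in self.inputs))


# Build the graph for y = (a + b) * c. Nothing is computed yet.
a, b, c = Node("a"), Node("b"), Node("c")
total = Node("add", lambda x, y: x + y, (a, b))
y = Node("mul", lambda x, y: x * y, (total, c))

# Computation happens only when values flow through the graph.
result = y.evaluate({"a": 2.0, "b": 3.0, "c": 4.0})
```

Separating graph construction from execution is what lets a framework plan where each operation runs before any numbers move.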
What is a Tensor?
A tensor is an n-dimensional array of numbers
• 1D is a vector
• 2D (M x M) matrix/tensor is a square array of numbers (M numbers tall and
M numbers wide)
• M x M x M tensor is a cube array (M tall, M wide, M deep)
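The three cases above can be written as nested lists. This is a plain-Python sketch; the `shape` helper is written for illustration and assumes regular, non-empty nesting.

```python
def shape(tensor) -> tuple:
    """Return the size along each dimension of a regular nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]  # assumes every level is non-empty and regular
    return tuple(dims)


m = 3
vector = [0.0] * m                                          # 1-D: m values
matrix = [[0.0] * m for _ in range(m)]                      # 2-D: m x m
cube = [[[0.0] * m for _ in range(m)] for _ in range(m)]    # 3-D: m x m x m

shapes = [shape(vector), shape(matrix), shape(cube)]
```

The number of entries in the shape is the number of dimensions. The dimensions do not have to be equal: a 2 x 5 array is also a 2D tensor.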
Why TensorFlow for this problem?
• TensorFlow has rich documentation
• Works on CPUs and GPUs (as most DL frameworks do)
• Versions 1.x and above add higher-level function abstractions that make it
easier to use
• Once a model is trained, tested, and optimized, it can be deployed
to edge computing environments or containers
Deep Learning Implementations
• Deep Neural Networks: providing lift for classification and forecasting models
• Convolutional Neural Networks: feature extraction and classification of images
• Recurrent Neural Networks: for sequences of events (sentences or time series)
Recurrent Neural Network at a Glance
• A neural network for data that is treated as a sequence, where the
particular order of the data points matters
• Sometimes the input is a sequence and the output is a single
vector, or the other way around.
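A minimal sketch of the sequence-in, single-value-out case, using one recurrent unit in plain Python. The weights are fixed, arbitrary values chosen for illustration (a real model learns them), and `rnn_final_state` is a hypothetical name.

```python
import math
from typing import Sequence


def rnn_final_state(xs: Sequence[float], w_x: float = 0.5,
                    w_h: float = 0.8, b: float = 0.0) -> float:
    """Run a single-unit recurrent cell over a sequence and return the last hidden state.

    h_t = tanh(w_x * x_t + w_h * h_{t-1} + b), starting from h = 0.
    """
    h = 0.0
    for x in xs:  # one step per element: the loop is the network unrolled through time
        h = math.tanh(w_x * x + w_h * h + b)
    return h


# Same values, opposite order: the final state differs because each step
# depends on the state left behind by the previous one.
forward = rnn_final_state([0.1, 0.5, 0.9])
backward = rnn_final_state([0.9, 0.5, 0.1])
```

Because the hidden state carries information from earlier steps, reordering the inputs changes the output, which is the property that makes the architecture suited to time series.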
Recurrent Neural Network Topology
Unrolling through time
What are we working with?
• Challenge: A sensor attached to an automated manufacturing device
captures position and calibration at each time stamp, giving real-time
data on the device and its current health. The data is stored for
historical analysis to identify trends and patterns and to determine
whether any devices need to be taken out of production for health checks
and maintenance.
• Data: 2,014 .dat files that, when unpacked, were in XML format
Workflow
• Import Data into environment
• Perform data transformations
• Exploration of historical data
• Model Construction
• Model Testing
• Deploy Model into Streaming Consumer
• Integrate Visualization
Data Import
• MapR-FS allows a user to ingest any file type
– The filesystem offers schema-on-read rather than schema-on-write.
Schema-on-write requires knowing your schema before data is written;
when the data is read, it comes back in the schema defined up front.
Schema-on-read allows data to be loaded as-is, with no preprocessing,
thereby removing obstacles to data capture.
• Data was uploaded as a compressed file into MapR-FS
– This can also be done via NFS or scp
• Files were extracted from the .zip file
• Extracted files were of type .dat; the file extension was changed to .xml
using standard bash commands
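The talk did the extract-and-rename step with bash. The sketch below shows the same two operations in Python's standard library instead, for consistency with the other examples. The function name and paths are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_and_rename(archive: Path, out_dir: Path) -> List[Path]:
    """Extract every file from `archive`, then rename *.dat files to *.xml."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)

    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for dat_file in sorted(out_dir.rglob("*.dat")):
        xml_file = dat_file.with_suffix(".xml")
        dat_file.rename(xml_file)  # only the extension changes; contents are untouched
        renamed.append(xml_file)
    return renamed
```

A call such as `extract_and_rename(Path("sensors.zip"), Path("raw_xml"))` would leave one `.xml` file per original `.dat` file under `raw_xml`.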
Data Transformation
• Use Spark for bulk ETL from many XML files into a single CSV
[Screenshots: the source XML format and the resulting DataFrame]
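The talk used Spark for this step. To show only the shape of the transformation, the sketch below uses Python's standard library rather than Spark. The XML layout is an assumption: a `<reading>` element with `timestamp`, `position`, and `calibration` children is hypothetical, since the slides do not show the actual schema.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Iterable

FIELDS = ["timestamp", "position", "calibration"]  # hypothetical field names


def xml_files_to_csv(xml_paths: Iterable[Path], csv_path: Path) -> int:
    """Flatten <reading> elements from many XML files into one CSV; return the row count."""
    rows = 0
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        for path in xml_paths:
            root = ET.parse(path).getroot()
            for reading in root.iter("reading"):
                # findtext returns None for a missing child; write it as an empty cell.
                writer.writerow({name: reading.findtext(name) or "" for name in FIELDS})
                rows += 1
    return rows
```

With 2,014 input files a single-process loop like this may be adequate, but a distributed engine such as Spark becomes the natural choice as file counts and sizes grow.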
Data Exploration
• Use Jupyter Notebook for interactive data exploration and model
development
Data Prep and Model Building
• Use Jupyter Notebook for prepping data, developing your
model, and setting hyperparameters
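One common way to prepare a time series for a recurrent model is to rescale it and cut it into fixed-length windows, each paired with the value that follows. The slides do not show the notebook's actual preparation, so this is a sketch under that assumption. The helper names, the readings, and the window length (a hyperparameter) are illustrative.

```python
from typing import List, Sequence, Tuple


def min_max_scale(series: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in series]


def make_windows(series: Sequence[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Split a series into (inputs, targets): each input is `window` consecutive
    values and its target is the value that immediately follows."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(list(series[start:start + window]))
        targets.append(series[start + window])
    return inputs, targets


readings = [10.0, 12.0, 14.0, 16.0, 18.0]   # hypothetical sensor values
X, y = make_windows(min_max_scale(readings), window=3)
```

Each row of `X` is one input sequence and the matching entry of `y` is the next-period value the model is asked to predict.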
Model Training and Testing
• Use Jupyter Notebook for training and testing the model
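For time-series data, a typical testing setup holds out the most recent samples and scores predictions with an error metric. The slides do not state how the model was evaluated, so the chronological split and the RMSE metric below are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def chronological_split(samples: Sequence, test_fraction: float = 0.2) -> Tuple[List, List]:
    """Keep the oldest samples for training and the newest for testing (no shuffling)."""
    n_test = int(len(samples) * test_fraction)
    cut = len(samples) - n_test
    return list(samples[:cut]), list(samples[cut:])


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


train, test = chronological_split(list(range(10)))
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Splitting by time rather than at random keeps future readings out of the training set, so the test score reflects how the model would behave on data it has not yet seen.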
Model Deployment
• The tested RNN model can be deployed on new data as
it streams from the sensor attached to the device
• The model generates an alert if the predicted metric exceeds the historically
normal threshold
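The alerting logic can be sketched as a generator over incoming readings. Several pieces here are assumptions made for illustration: the threshold rule (mean plus k standard deviations of historical data), the window length, and `naive_predict`, which stands in for the trained RNN.

```python
import statistics
from typing import Callable, Iterable, Iterator, List, Tuple


def normal_threshold(history: List[float], k: float = 3.0) -> float:
    """Upper bound of 'historically normal': mean + k standard deviations (an assumed rule)."""
    return statistics.mean(history) + k * statistics.pstdev(history)


def alerts(stream: Iterable[float], predict: Callable[[List[float]], float],
           threshold: float, window: int) -> Iterator[Tuple[int, float]]:
    """Yield (index, predicted_value) whenever the next-period prediction exceeds the threshold."""
    recent: List[float] = []
    for i, reading in enumerate(stream):
        recent.append(reading)
        recent = recent[-window:]          # keep only the latest `window` readings
        if len(recent) == window:
            predicted = predict(recent)
            if predicted > threshold:
                yield i, predicted


def naive_predict(recent: List[float]) -> float:
    """Stand-in for the trained RNN: extrapolate the last observed change."""
    return recent[-1] + (recent[-1] - recent[-2])


history = [10.0, 11.0, 9.0, 10.0, 10.0]     # hypothetical historical readings
limit = normal_threshold(history)
found = list(alerts([10.0, 10.5, 11.0, 13.0, 12.0], naive_predict, limit, window=2))
```

In a deployment, `stream` would be the consumer's feed of sensor messages and `predict` would call the saved model; the alert condition itself stays this simple.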
Visualization
• Dashboard can show real-time trends and behaviors of the sensor data
and the next-period prediction
Microservices for Serving Deep Learning Models
Event Streams & DB
Advantages
• Deploy models to production as
microservices
• Use files, real-time streams and
databases in production
• Scales horizontally
• Supports both real-time and batch
• May or may not be stateful
Points to Remember
• TF is just one ML tool among many (but a great one)
• Choosing the right one depends on your problem
– Ex: Supervised or Unsupervised learning
• How does this model or solution scale?
• Once a model is optimized and insight gained, how can I deploy
my model to help my organization?
• Tools are never used in isolation; the platform matters!
– Support the Workflow, Not Just Modeling
90+% of Machine Learning Success is Data Logistics, Not Learning
Q&A
ENGAGE WITH US
@mapr
Blog: https://mapr.com/blog/
MapR Academy: http://learn.mapr.com/
