UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 1
Boeing F/A-18F Super Hornet at takeoff at Danish Air Show 2014 on June 22
Image attribution: Slaunger
CC BY-SA 3.0
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 2
The views of the author(s) expressed herein do not necessarily represent those of the U.S. Navy or Department of Defense (DoD). Presentation of this material does not constitute or imply its endorsement, recommendation, or favoring by the DoD.
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 3
• Team of 20 data scientists and software engineers
• Navy civil servants and contractor support personnel
• Naval Air Warfare Center Training Systems Division in Orlando, FL
NAWCTSD Enterprise Research Data Science
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 4
Issue
Unscheduled maintenance and component degradation impacting readiness
Hornet Health Assessment and Readiness Tool (HhART)
Create a real-time monitoring tool for a fleet of aircraft that is driven by engineering-approved features and models that can be used to assist maintenance decisions and predict component degradation
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 5
SME ETL Data science Deployment Feedback
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 6
SME ETL Data science Deployment Feedback
• Engineers
• Designers
• Maintainers
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 7
Engagement
Education
Communication
Trust
Culture
Engineers
Maintainers
Leaders
Data
scientists
Process
engineers
Developers
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 8
SES 405 - Exploration Systems Engineering (ASU)
System Hierarchy Module (9)
• Complex system of systems
• Data scientists can support at each level
• Each component is unique
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 9
• Confounding effects
• Inconsistent recording resolutions
• Both discrete and continuous data
• Changing schemas between software versions
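One way to handle the "inconsistent recording resolutions" item above is to resample every parameter onto a shared time grid before analysis. The following is a minimal sketch in plain Python, not the team's pipeline: the parameter names, sample rates, and the choice of forward-fill (carry the last recorded value) are all assumptions made for illustration.

```python
from bisect import bisect_right

def forward_fill(samples, grid):
    """Resample sorted (time, value) samples onto grid using the last known value.

    Grid points that fall before the first sample get None.
    """
    times = [t for t, _ in samples]
    out = []
    for g in grid:
        i = bisect_right(times, g) - 1  # index of the last sample at or before g
        out.append(samples[i][1] if i >= 0 else None)
    return out

# Hypothetical parameters: a continuous signal at 1 Hz and a discrete flag at 4 Hz.
oil_pressure = [(0.0, 50.0), (1.0, 52.0), (2.0, 51.0)]
gear_down = [(0.0, 1), (0.25, 1), (0.5, 1), (0.75, 0), (1.0, 0), (1.25, 0)]

# Shared 2 Hz grid chosen for the example.
grid = [0.0, 0.5, 1.0, 1.5]
aligned = list(zip(grid,
                   forward_fill(oil_pressure, grid),
                   forward_fill(gear_down, grid)))
```

Forward-fill is a natural fit for discrete flags, since it never invents intermediate values; for continuous signals, interpolation may be preferable.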
SME ETL Data science Deployment Feedback
• Identify
• Acquire
• Load
• Engineers
• Designers
• Maintainers
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 10
Batch analysis Streaming
Flight data Storage Warehouse
Staging Analysis
Cleaning
Validation
Analysis
Results
Models
Flight data Storage
Analysis
Live data
Models
Streaming
Logic-based metrics
Dashboard
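The diagram pairs a batch path over stored flight data with a streaming path over live data. A rough sketch of how one logic-based metric could serve both paths is shown here, with plain Python iterators standing in for the real storage and streaming layers. The record layout, the `LIMIT` value, and the function names are hypothetical.

```python
from collections import defaultdict

LIMIT = 100.0  # hypothetical exceedance limit, not taken from the talk

def exceeds(record):
    """Logic-based metric: True when the reading is above the limit."""
    return record["value"] > LIMIT

def batch_counts(records):
    """Batch path: count exceedances per aircraft over stored flight data."""
    counts = defaultdict(int)
    for r in records:
        if exceeds(r):
            counts[r["aircraft"]] += 1
    return dict(counts)

def stream_counts(record_iter):
    """Streaming path: yield the running count after each incoming record."""
    counts = defaultdict(int)
    for r in record_iter:
        if exceeds(r):
            counts[r["aircraft"]] += 1
        yield r["aircraft"], counts[r["aircraft"]]

records = [
    {"aircraft": "A1", "value": 90.0},
    {"aircraft": "A1", "value": 120.0},
    {"aircraft": "B2", "value": 101.0},
    {"aircraft": "A1", "value": 130.0},
]
batch_result = batch_counts(records)
stream_result = list(stream_counts(iter(records)))
```

Keeping the metric itself (`exceeds`) separate from the batch and streaming drivers means the rule an engineer approves in batch analysis is the same rule that feeds the dashboard.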
GitLab is a registered trademark of GitLab, Inc.
Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 11
SME ETL Data science Deployment Feedback
• Identify
• Acquire
• Load
• Explore
• Develop
• Iterate
• Engineers
• Designers
• Maintainers
• Learn
  • Understand the problem domain
• Analyze
  • Find indicators of the problem
• Develop
  • Features to predict the indicators
• Refine
  • Iterate with the SME to better define the features
Iterate
Learn
Analyze
Develop
Refine
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 12
• Logic
  • Supply engineers with data science superpowers
  • Enhance error detection
  • Moderately predictive
• Deep learning
  • Learn what normal behavior is
  • Detect complex parameter interaction
  • Highly predictive
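To make the "Logic" approach concrete, here is a small sketch of the kind of rule an engineer might specify: flag a component when a commanded value and the measured value disagree for several samples in a row. The signal names, tolerance, and persistence window are invented for the example and are not taken from the talk.

```python
def persistent_disagreement(commanded, actual, tolerance, min_samples):
    """Return indices where |commanded - actual| has exceeded tolerance
    for at least min_samples consecutive samples."""
    flagged = []
    run = 0  # length of the current run of out-of-tolerance samples
    for i, (c, a) in enumerate(zip(commanded, actual)):
        run = run + 1 if abs(c - a) > tolerance else 0
        if run >= min_samples:
            flagged.append(i)
    return flagged

# Hypothetical commanded vs. measured positions for one actuator.
commanded = [10, 10, 10, 10, 10, 10, 10]
actual = [10, 13, 10, 14, 15, 13, 10]
flags = persistent_disagreement(commanded, actual, tolerance=2, min_samples=3)
```

The persistence requirement keeps a single noisy sample (index 1 here) from raising a flag, which is one reason rules like this are explainable to maintainers even though they are only moderately predictive.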
Images created using public NASA DashLink dataset
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 13
• Anomaly detection
  • Finding errant behavior in noisy signals
• Virtual sensors
  • Replicating normal behavior
• Information compression
  • Targeting specific interactions
Model input
Model output
Reconstruction error
Potential anomalies
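The flow from model input and output to potential anomalies can be sketched as follows. The model itself is left out: the code takes its outputs as given and shows only the reconstruction-error step. The mean-plus-k-standard-deviations threshold, the value of `k`, and the sample numbers are assumptions for illustration, not the method described in the talk.

```python
from statistics import mean, pstdev

def reconstruction_error(model_input, model_output):
    """Per-sample mean squared error across parameters."""
    return [
        mean((x - y) ** 2 for x, y in zip(row_in, row_out))
        for row_in, row_out in zip(model_input, model_output)
    ]

def fit_threshold(normal_errors, k=3.0):
    """Assumed rule: mean + k standard deviations of errors on normal data."""
    return mean(normal_errors) + k * pstdev(normal_errors)

def potential_anomalies(errors, threshold):
    """Indices of samples whose reconstruction error exceeds the threshold."""
    return [i for i, e in enumerate(errors) if e > threshold]

# Errors observed on known-normal flights (made-up numbers).
threshold = fit_threshold([0.01, 0.02, 0.01, 0.02])

# Three samples of two parameters each, with hypothetical model outputs.
model_input = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
model_output = [[1.1, 2.1], [1.0, 2.0], [2.0, 2.0]]
errors = reconstruction_error(model_input, model_output)
anomalies = potential_anomalies(errors, threshold)
```

A model trained only on normal behavior reconstructs normal samples well, so a large error on a new sample suggests behavior the model has not seen.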
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 14
• Feature score normalization
• Aggregate scores
• Seeing trends
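Score normalization and aggregation might look something like the sketch below. Min-max scaling and a weighted mean are assumed choices, and the feature names, aircraft IDs, and raw scores are invented; the talk does not say which scheme HhART uses.

```python
def min_max_normalize(scores):
    """Scale one feature's raw scores to [0, 1] across aircraft."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}  # no spread, so nothing stands out
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def aggregate(normalized_by_feature, weights=None):
    """Weighted mean of normalized feature scores for each aircraft."""
    features = list(normalized_by_feature)
    weights = weights or {f: 1.0 for f in features}
    total = sum(weights[f] for f in features)
    aircraft = normalized_by_feature[features[0]].keys()
    return {
        a: sum(weights[f] * normalized_by_feature[f][a] for f in features) / total
        for a in aircraft
    }

# Hypothetical raw feature scores on different scales.
raw = {
    "vibration": {"A1": 2.0, "B2": 4.0, "C3": 6.0},
    "temp_delta": {"A1": 10.0, "B2": 30.0, "C3": 20.0},
}
normalized = {feature: min_max_normalize(s) for feature, s in raw.items()}
health = aggregate(normalized)
```

Normalizing first keeps a feature with large raw units from dominating the aggregate; repeating the calculation per flight would give the series needed to see trends.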
SME ETL Data science Deployment Feedback
• Engineers
• Designers
• Maintainers
• Identify
• Acquire
• Load
• Explore
• Develop
• Iterate
• Live
• Targeted
• Relevant
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 15
• Speed
• On-prem vs. cloud
• Automation
• Security
• Hardening containers
• RMF, C-ATO
• Platform One
• AF CSO: software.af.mil
• Live monitoring system
Kubernetes and the Kubernetes logo are registered trademarks of The Linux Foundation.
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 16
• Critical at all stages of the process cycle
• User needs should drive development
• Tool should be accurate and explainable
• Regularly communicate with users
SME ETL Data science Deployment Feedback
• Engineers
• Designers
• Maintainers
• Identify
• Acquire
• Load
• Explore
• Develop
• Iterate
• Live
• Targeted
• Relevant
• Communicate
• Incorporate
• Update
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 17
• Continuous
• Tailored to new platforms and sub-systems
• Adapted to meet the needs of the user
Collaboration
Learning
Developing
Refining
Visualizing
Deploying
Feedback
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 18
Lessons learned
• Data science is most effective when tightly integrated into the organizational structure
• Progress is difficult at best unless data scientists understand the system
Near-term goals
• Move to include additional platforms
• Deploy HhART at the edge in collaboration with partners
UNCLASSIFIED // APPROVED FOR PUBLIC RELEASE 19

Using Apache Spark for Predicting Degrading and Failing Parts in Aviation
