Sandy May & Richard Conway, Elastacloud
Azure Databricks
Monitoring Solar Farms
#SAISDD11
About Us
• Richard Conway, Founder and Director of Elastacloud, a UK cloud data analytics consultancy, and Azure MVP/Microsoft Regional Director; Sandy May, Cloud Big Data Surgeon
• Microsoft Azure Gold Partner, Cloud Platform and Data Analytics, OSS Partner of the year 2015,
Microsoft Partner of the Year nominee 2018
• Co-founder of UK Azure User Group, IoT and Data Science Innovators UK, UK Cloud Infrastructure User
Group
• Author of the data science degree on academy.microsoft.com
• Running AzureCraft in UK annually
• Contributors to open source, several Apache projects including Storm, Spark, Libcloud and Parquet
• 50+ people, offices in London, Nottingham and Spain
What we’ll cover in this presentation
• A little about solar and renewables data acquisition
• A little about Azure and how we design things
• Using Databricks in a fun way
• Orchestrating the future
The Solar Farm
images from mnn.com
Courtesy of Wikipedia: installed UK capacity is just under 10 GW (approximate)
Solar Irradiance
[Diagram: solar irradiance (W/m²) on three panel strings feeding an inverter, which converts DC to AC for the National Grid]
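As a rough illustration of the chain on this slide (irradiance in W/m² onto the strings, then DC to AC through the inverter), the sketch below estimates AC output from measured irradiance. The panel area, module efficiency and inverter efficiency are hypothetical values picked for the example, not figures from the talk, and the linear model deliberately ignores temperature, shading and clipping.

```python
def estimate_ac_power_w(irradiance_w_m2, panel_area_m2,
                        module_efficiency, inverter_efficiency):
    """Estimate AC power in watts with a simple linear model.

    DC power = irradiance x panel area x module efficiency.
    AC power = DC power x inverter efficiency.
    """
    if irradiance_w_m2 < 0:
        raise ValueError("irradiance cannot be negative")
    dc_power_w = irradiance_w_m2 * panel_area_m2 * module_efficiency
    return dc_power_w * inverter_efficiency


# Hypothetical string: 40 m2 of panels at 18% efficiency, 97% efficient inverter
string_ac_w = estimate_ac_power_w(800.0, 40.0, 0.18, 0.97)
```

A model like this is only a baseline; comparing it with recorded power is one way to spot an underperforming string.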
Modbus Collection of Data
[Diagram labels: Inverter, Portal, Consumer, Logging, Get Data, TCP Port 502, Redundant Consumer, TCP Port xxx]
Things to note:
- An inverter can generally accept only a single Modbus TCP connection
- As such, hub-and-spoke models are needed to relay data to the cloud
- Modbus def example:
29;1;INV_EFFICIENCY;3;2;29;;;0.1;0;%;2
[Field annotations on the slide: Register No., Msg Type, Min/Max/Avg, Units, Per minute]
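To make the Modbus collection step more concrete, here is a minimal sketch of two pieces a collector might need: building a standard Modbus TCP "read holding registers" request, and splitting a definition line like the example on the slide. The frame layout follows the Modbus TCP specification. The meaning assigned to each position in the definition line is an assumption inferred from the example and the slide's annotations, and the helper names are hypothetical.

```python
import struct

READ_HOLDING_REGISTERS = 3  # standard Modbus function code


def build_read_request(transaction_id, unit_id, start_register, count):
    """Build a Modbus TCP request frame for reading holding registers.

    MBAP header: transaction id, protocol id (always 0), length of the
    bytes that follow (unit id + 5-byte PDU = 6), unit id.
    PDU: function code, start register, register count.
    All multi-byte fields are big-endian.
    """
    return struct.pack(">HHHBBHH", transaction_id, 0, 6, unit_id,
                       READ_HOLDING_REGISTERS, start_register, count)


def parse_definition(line):
    """Split a semicolon-separated register definition line.

    ASSUMPTION: field positions are guessed from the slide's example
    (index 0 = register number, 2 = name, 8 = scale factor, 10 = unit).
    The real file format may differ.
    """
    fields = line.split(";")
    return {
        "register": int(fields[0]),
        "name": fields[2],
        "scale": float(fields[8]),
        "unit": fields[10],
    }


def scale_reading(raw_value, definition):
    """Convert a raw register value to engineering units."""
    return raw_value * definition["scale"]


definition = parse_definition("29;1;INV_EFFICIENCY;3;2;29;;;0.1;0;%;2")
# Register offset conventions (0-based or 1-based) vary by device; check the vendor map.
request = build_read_request(transaction_id=1, unit_id=1,
                             start_register=definition["register"], count=1)
```

Because the inverter typically allows only one Modbus TCP connection, a request like this would be sent by the single on-site collector, which then relays the scaled values onward.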
Cloud Gateway Patterns from Farm
[Diagram labels:]
- Inverter, Pyranometer, Weather station, Export Meter
- Modbus Collector Server
- Generally terrible router
- Cloud Gateway
- Event Hub Message Stream
- Low spec AWS Server
- Data Orchestration
- API with 24 hour lag
- Event Hub Message Stream
- Takes around 8-9 hours!
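With an unreliable router between the farm and the cloud, one common gateway technique is store-and-forward: queue readings locally and flush them in order when the link is available. The sketch below is a hypothetical illustration of that idea, not the gateway used in the talk. The `send` callable stands in for whatever publishes to the message stream and is assumed to raise `ConnectionError` on failure.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Hold readings while the uplink is down and flush them oldest-first."""

    def __init__(self, send, max_items=10_000):
        # send: hypothetical callable taking one reading; raises
        # ConnectionError when the uplink is unavailable.
        self._send = send
        # When the buffer is full, the oldest readings are dropped.
        self._queue = deque(maxlen=max_items)

    def submit(self, reading):
        """Queue a reading, then try to drain the queue."""
        self._queue.append(reading)
        return self.flush()

    def flush(self):
        """Send queued readings in order; stop at the first failure."""
        while self._queue:
            try:
                self._send(self._queue[0])
            except ConnectionError:
                return False
            self._queue.popleft()
        return True

    def pending(self):
        """Number of readings still waiting to be sent."""
        return len(self._queue)
```

A reading is removed from the queue only after `send` returns, so a failed attempt leaves it in place for the next flush.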
All sources of information
[Diagram labels: data sources shown alongside Azure]
- Farm
- weatherbit.io
- darksky
- openweather
- Met office
- cloud cover
- suntimes
Balancing Energy
[Chart: production against time (intraday), with shortfall and surplus marked]
- Shortfalls lead to buy-back to fulfil contracts
- Surpluses need selling
- Grid needs to be balanced at all times
- Need to understand the intraday pricing market to understand exposure
- Insure risk through PPAs for lower returns
- National Grid has a balancing cost
- Internal transactions need balancing too
- Need to understand interconnects
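The shortfall and surplus idea can be sketched as simple arithmetic over intraday periods: production minus contracted volume, where a negative value is a shortfall to buy back and a positive value is a surplus to sell. The function names and the MWh figures below are hypothetical and exist only to show the calculation.

```python
def imbalance_by_period(production_mwh, contracted_mwh):
    """Return production minus contract for each intraday period.

    Negative values are shortfalls (energy to buy back);
    positive values are surpluses (energy to sell).
    """
    if len(production_mwh) != len(contracted_mwh):
        raise ValueError("both series must cover the same periods")
    return [p - c for p, c in zip(production_mwh, contracted_mwh)]


def total_shortfall_and_surplus(imbalances):
    """Sum shortfalls and surpluses separately, both as positive MWh."""
    shortfall = sum(-x for x in imbalances if x < 0)
    surplus = sum(x for x in imbalances if x > 0)
    return shortfall, surplus


# Hypothetical three-period example
imbalances = imbalance_by_period([10.0, 12.5, 8.0], [11.0, 11.0, 9.0])
shortfall_mwh, surplus_mwh = total_shortfall_and_surplus(imbalances)
```

Multiplying each period's imbalance by the relevant intraday price would give a first estimate of exposure; prices are left out here because the talk does not give any.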
Our Azure Architecture
[Diagram components: Databricks; Event Hubs; SCADA API for actual power and recorded irradiance; weather APIs (MetOffice, DarkSky and WeatherBit)]
- 5-minute weather data is streamed for over 1,200 solar farms in Europe; the data is used to create solar irradiance predictions and power output predictions for farms using the service
- Actuals are collected for model training and evaluation
- Predictions and model performance are recorded in Azure SQL Data Warehouse and reported on using Power BI
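Evaluating predictions against collected actuals can be illustrated with two common error metrics, mean absolute error and root mean squared error. This is a generic sketch: the talk does not say which metrics were recorded, and the function name and sample values are hypothetical.

```python
import math


def evaluate_predictions(predicted_kw, actual_kw):
    """Compare predicted and actual power for matching time intervals.

    Returns mean absolute error (MAE) and root mean squared error (RMSE),
    both in the same units as the inputs.
    """
    if not predicted_kw or len(predicted_kw) != len(actual_kw):
        raise ValueError("series must be non-empty and of equal length")
    errors = [p - a for p, a in zip(predicted_kw, actual_kw)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Hypothetical 5-minute intervals for one farm
metrics = evaluate_predictions([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```

Metrics like these, computed per farm and per interval, are the kind of values that could be written to a warehouse table and charted in a reporting tool.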
Demos and Code!
Thanks and Questions?


Using Azure Databricks, Structured Streaming, and Deep Learning Pipelines to Monitor 1,000+ Solar Farms in Real Time with Sandy May and Richard Conway
