Enterprise Data and Analytics Architecture Overview for Electric Utility
Dr. Prajesh Bhattacharya
enSustain
Copyright enSustain
Summary
1
The Overall Architecture
Architecture diagram (labels only):
• Outputs and users: Production Reports; Projects; Data Explorers (Engineers, Data scientists)
• Data Warehouse (ONLY data required for high-performing production reporting), with Enterprise Nomenclature
• Data Lake (ALL data), with Enterprise Nomenclature
• Processes: Usual EDW process and structure; Discovery and indexing/Tagging
• Data Translation Layer (MDMS/EI)
• Source systems and data: DMS & OMS; Customer Data; Smart Meter (metadata, readings); Asset Data (Location, config); Financial Data; Data Historian; SCADA; DG metadata; DG generation data; HR Data; Weather Data; Misc. Sensor head ends; Security Data; Transmission Planning Data; Maintenance Data; Demand Response Data; Transmission OE and Dispatch Data; EMS; Transmission Market Data; IT Asset Ops Data; IT Support Data; Project Documents; Marketing & Sales Data; Catch-All Other Applications; Email & chat logs; Facility Data; Fleet Data
Possible Point-To-Point Exceptions
• Purpose-oriented connection. Examples:
  o Historian facilitates connection with SCADA
  o EMS-SCADA connection is latency-sensitive
• Application requiring access to only one system
  o DMS applications running off of DMS data
  o Historian applications running off of Historian data
2
Implementation
The Approach for Implementation
Main Challenges
• Siloed data
  o Solution Part 1: Standard Data Model
  o Solution Part 2: User view of unified data
• Lack of analytics ideas
  o Solution: Close partnership between IT and business
• Lack of budget
  o Solution Part 1: Tax each new project
  o Solution Part 2: Take baby steps
Necessary Conditions for Success
• At the beginning, implement the new mechanism ONLY to serve the new requirements
• Keep the existing connections working and unaffected
• Eventually, the business will deem some of the existing connections unnecessary
• The rest of the existing connections can be converted as part of application maintenance/overhaul/upgrade, but not in the beginning phase of the initiative
Do NOT touch the existing, working systems first
• Do not try to implement all the necessary new components at once
• Good quality on a small scope is better than mediocre quality on a large scope
• It might require more overhead, but it is often worth it
Scope the smallest possible piece and do it well
Possible Steps for Implementation
1. Onboard the first source system:
  o A new data connectivity requirement comes in
  o Identify the source system
  o Define the enterprise nomenclature for the source system, aligned with the industry standard
  o Load MDMS/EI with the dictionary
  o Configure EI to act as the data virtualization layer for the source system
  o Release for production use with an appropriate support mechanism
  Milestone: One project is now using this new mechanism for one source system
2. Repeat 1 for every new data connectivity request
3. As more source systems are brought into scope, resolve discrepancies, if any arise
4. The virtualization layer might experience performance issues as data load increases
5. Research and plan the Data Lake
6. For every new data source implemented for virtualization, implement the corresponding ETL for the Data Lake
7. Open the Data Lake to users who prefer getting their data from the Data Lake (delayed but faster) over virtualization
8. Implement Data Lake analytics (say, ML based on Spark) for a single use case
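The dictionary and translation steps above can be illustrated with a minimal sketch. It assumes the enterprise nomenclature is kept as a per-source-system mapping from native field names to enterprise names. The system names (`scada`, `ami`), the field names, and the `translate` helper are all hypothetical and are not taken from any MDMS/EI product.

```python
from typing import Any, Dict

# Hypothetical enterprise dictionary: source system -> {native field: enterprise field}.
ENTERPRISE_DICTIONARY: Dict[str, Dict[str, str]] = {
    "scada": {"PT_ID": "asset_id", "VAL": "measured_value", "TS": "timestamp_utc"},
    "ami": {"meter_no": "asset_id", "kwh": "measured_value", "read_time": "timestamp_utc"},
}


def translate(source_system: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename a source record's fields to the enterprise nomenclature.

    Fields that are not in the dictionary are kept under an 'unmapped.' prefix
    so that discrepancies surface instead of being silently dropped.
    """
    # Raises KeyError if the source system has not been onboarded yet.
    mapping = ENTERPRISE_DICTIONARY[source_system]
    translated: Dict[str, Any] = {"source_system": source_system}
    for native_name, value in record.items():
        enterprise_name = mapping.get(native_name, f"unmapped.{native_name}")
        translated[enterprise_name] = value
    return translated


if __name__ == "__main__":
    print(translate("ami", {"meter_no": "M-001", "kwh": 1.25, "quality": "ok"}))
```

Onboarding another source system in this sketch only means adding one more entry to the dictionary, which mirrors the "repeat for every new data connectivity request" step.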
3
MDMS, EI, Data Virtualization, Data Warehouse
Skip This Section
Most utilities already use these systems and are familiar with them. Hence, we will not discuss them.
For specific questions, please contact prajesh@ensustain.com
4
Data Lake
Why the Data Lake?
If the MDMS/EI layer virtualizes the data, then access to standardized data across the enterprise is already established. What additional value does the Data Lake bring?
• Some of the systems of record (SOR) might not be capable of handling that many data requests
• Access to some of the SOR systems might not be practical
• Implementing data quality checks on virtualized data is hard (at the least, it would slow down queries)
• Data travel over the network is larger in a virtualized environment than in a Data Lake designed and used in a specific way
• Bottom line: go for a Data Lake only if it is foreseen to be needed
Data Lake: not the immediate need, but the eventual destination
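The point about data quality checks can be sketched as follows: with a Data Lake, the check can run once when a batch is loaded instead of on every query against virtualized data. This is only an illustration under stated assumptions. The record layout, the rules in `check_reading`, and the `load_batch` helper are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Reading = Dict[str, object]


def check_reading(reading: Reading) -> List[str]:
    """Return the quality problems found in one reading (hypothetical rules)."""
    problems: List[str] = []
    if not reading.get("asset_id"):
        problems.append("missing asset_id")
    value = reading.get("measured_value")
    if not isinstance(value, (int, float)) or isinstance(value, bool):
        problems.append("measured_value is not numeric")
    elif value < 0:
        problems.append("measured_value is negative")
    return problems


def load_batch(
    batch: Iterable[Reading],
) -> Tuple[List[Reading], List[Tuple[Reading, List[str]]]]:
    """Split a batch into accepted rows and rejected rows with reasons.

    The checks run once here, at load time, so later queries against the
    accepted rows do not pay for them again.
    """
    accepted: List[Reading] = []
    rejected: List[Tuple[Reading, List[str]]] = []
    for reading in batch:
        problems = check_reading(reading)
        if problems:
            rejected.append((reading, problems))
        else:
            accepted.append(reading)
    return accepted, rejected
```

Keeping the rejected rows together with their reasons, rather than discarding them, gives the data owners something concrete to fix at the source.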
Data Lake: Market Offering Landscape
Data Lake: Getting Data Into the Lake
HDFS
Enterprise Data
• Shared across the company based on security policy
• Fully managed and maintained
• Tight SLA
• 100% enterprise-taxonomy-based tagging
User Data
• Results of ad-hoc analyses
• Some maintenance/control/SLA
• Folksonomy-based tagging
Project/Group Data
• Enterprise standards might be too restrictive to fulfill the requirements of the project
• Shared among a handful of users
• Medium maintenance/control/SLA
• Folksonomy + some governance
Other diagram labels: Data Governance; Tagging Tool; MDMS; Data Loader; Streaming Data; Manual Data; DATA SOURCES
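The three kinds of data above differ mainly in how strictly their tags are governed. A minimal sketch of that idea follows, assuming tags are plain strings and the enterprise taxonomy is a fixed set of approved terms. The zone names mirror the slide, while `ENTERPRISE_TAXONOMY`, its example terms, and `validate_tags` are hypothetical. Reading "some governance" as "at least one approved term" is an assumption as well.

```python
from enum import Enum
from typing import Set

# Hypothetical approved vocabulary; a real one would come from the governance tool.
ENTERPRISE_TAXONOMY: Set[str] = {"smart_meter", "scada", "asset", "customer", "weather"}


class Zone(Enum):
    ENTERPRISE = "enterprise"    # 100% enterprise-taxonomy-based tagging
    PROJECT = "project_group"    # folksonomy + some governance
    USER = "user"                # folksonomy-based tagging


def validate_tags(zone: Zone, tags: Set[str]) -> bool:
    """Check a dataset's tags against the tagging policy of its zone."""
    if not tags:
        # Assumption: every dataset must carry at least one tag.
        return False
    if zone is Zone.ENTERPRISE:
        # Only approved terms are allowed.
        return tags <= ENTERPRISE_TAXONOMY
    if zone is Zone.PROJECT:
        # Assumption: "some governance" means at least one approved term.
        return bool(tags & ENTERPRISE_TAXONOMY)
    # User zone: free-form tags are accepted.
    return True
```

For example, `validate_tags(Zone.PROJECT, {"scada", "pilot_feeder_12"})` passes in this sketch because one tag is an approved term, while the same tags would be rejected for `Zone.ENTERPRISE`.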
Hadoop Ecosystem Relevant To Utility
Components: HDFS; YARN; Map-Reduce Application in Java; Hive; Spark Streaming; Spark SQL; Spark ML; Sqoop; Oozie; Falcon; Hadoop native client; Storm; QueryIO; Waterline Data; Attivio; Apache Atlas
Functionality color legend: Data Loading; Job Management; Data Governance; Data Reading; Map-Reduce; Data Storage; Data Lake Vendor Solutions
5
Analytics
The Analytics Tool Landscape
Analytics tools are classified under three taxonomies (Taxonomy 1, Taxonomy 2, Taxonomy 3):
• Production, Project (semi-production), or Ad-hoc; each either Data Write-back or Read-only
• Managed (Server based) or Unmanaged (Desktop based)
• Coding heavy or Configuration heavy
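One way to put these taxonomies to work is to tag every tool in an inventory along each of them. The sketch below does this with enumerations. The class names, the `find_tools` helper, and the two example tools are hypothetical and only illustrate the classification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional


class Usage(Enum):
    PRODUCTION = "production"
    PROJECT = "project (semi-production)"
    AD_HOC = "ad-hoc"


class DataAccess(Enum):
    WRITE_BACK = "data write-back"
    READ_ONLY = "read-only"


class Hosting(Enum):
    MANAGED = "managed (server based)"
    UNMANAGED = "unmanaged (desktop based)"


class BuildStyle(Enum):
    CODING_HEAVY = "coding heavy"
    CONFIGURATION_HEAVY = "configuration heavy"


@dataclass(frozen=True)
class AnalyticsTool:
    name: str
    usage: Usage
    data_access: DataAccess
    hosting: Hosting
    build_style: BuildStyle


def find_tools(
    inventory: Iterable[AnalyticsTool],
    usage: Optional[Usage] = None,
    hosting: Optional[Hosting] = None,
) -> List[AnalyticsTool]:
    """Return the tools matching the given criteria; None means 'any'."""
    return [
        tool
        for tool in inventory
        if (usage is None or tool.usage is usage)
        and (hosting is None or tool.hosting is hosting)
    ]


# Hypothetical inventory entries, for illustration only.
INVENTORY = [
    AnalyticsTool("load-forecast-service", Usage.PRODUCTION, DataAccess.WRITE_BACK,
                  Hosting.MANAGED, BuildStyle.CODING_HEAVY),
    AnalyticsTool("engineer-workbook", Usage.AD_HOC, DataAccess.READ_ONLY,
                  Hosting.UNMANAGED, BuildStyle.CONFIGURATION_HEAVY),
]
```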
Sample Analytics Opportunities …
6
Appendix
Thank you!
Questions?
prajesh@ensustain.com