Azure Data Lake:
What is it? Why is it?
Where is it?
EUGENE POLONICHKO
DATA PLATFORM MVP
BIDWH ARCHITECT
About me
Eugene Polonichko has over 7 years of experience
with SQL Server. He has mainly focused on BI projects
(SSAS, SSIS, PowerBI, Cognos, Informatica
PowerCenter, Pentaho, Tableau). Eugene is a
passionate speaker and SQL community volunteer,
presenting regularly at PASS SQL Saturday events
and local user groups around Ukraine and Europe.
Eugene is a PASS Chapter Leader and holds the
Data Platform MVP title.
https://www.linkedin.com/in/eugenepolonichko/
https://twitter.com/EvgenPolonichko
Agenda
• What is Data Lake?
• Architecture of Azure Data Lake
• Azure Data Lake Store
• Overview of Azure Data Lake Store
• Compare
• For big data processing
• Azure Data Lake Analytics
• U-SQL
• Concepts
• U-SQL Script Structure
• Extractors
• U-SQL Jobs
• U-SQL catalog
• Monitoring and performance of U-SQL jobs
• Data Lake Analytics pricing
Data Lake
Architecture of Azure Data Lake
Azure Data Lake Stores
• Azure Data Lake Store is a hyper-scale repository for big data analytic workloads.
Azure Data Lake enables you to capture data of any size, type, and ingestion speed
in one single place for operational and exploratory analytics.
• The Azure Data Lake Store is an Apache Hadoop file system compatible with
Hadoop Distributed File System (HDFS).
• It can be accessed from Hadoop (available with an HDInsight cluster) using the
WebHDFS-compatible REST APIs.
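To make the WebHDFS-compatible access concrete, here is a minimal Python sketch that only builds a request URL and sends nothing. The endpoint pattern `https://<account>.azuredatalakestore.net/webhdfs/v1/<path>` is an assumption about the store's REST endpoint; the account name `mydatalake`, the folder `/logs/2018`, and the helper `webhdfs_url` are hypothetical placeholders.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(account: str, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS-style REST URL for a Data Lake Store account."""
    # Assumed endpoint pattern for the store; adjust if your account differs.
    base = f"https://{account}.azuredatalakestore.net/webhdfs/v1/"
    # WebHDFS selects the operation with the "op" query parameter.
    query = urlencode({"op": op, **params})
    return f"{base}{quote(path.lstrip('/'))}?{query}"


# "mydatalake" and "/logs/2018" are hypothetical placeholders.
url = webhdfs_url("mydatalake", "/logs/2018", "LISTSTATUS")
print(url)
```

A real call would also need an OAuth bearer token in the `Authorization` header; that part is left out here because it depends on how the account is secured.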
Azure Data Lake Stores
Use Cases
• Store social media posts, log files, sensor data
• Store corporate data such as relational databases (as flat files)
Data Lake Storage vs Azure Storage
Purpose
Data Lake Store: Optimized storage for big data analytics workloads
Azure Storage: General purpose object store for a wide variety of storage scenarios
Use cases
Data Lake Store: Batch, interactive, streaming analytics, log files, etc.
Azure Storage: Any type of text or binary data, such as application back end, ...
Structure
Data Lake Store: Account contains folders, which in turn contain data stored as files
Azure Storage: Storage account has containers
Performance
Data Lake Store: Optimized performance for parallel analytics workloads. High throughput and IOPS.
Azure Storage: Not optimized for analytics workloads
Big Data requirements
Pricing
Transaction prices
Storage prices
DEMO
Azure Data Lake Analytics
Azure Data Lake Analytics is an on-demand analytics job service to simplify big data analytics. You
can focus on writing, running, and managing jobs rather than on operating distributed
infrastructure.
• Dynamic scaling
• Develop faster, debug, and optimize smarter using familiar tools
• Affordable and cost effective
• Works with all your Azure data
• U-SQL: simple and familiar, powerful, and extensible
U-SQL
T-SQL C#
U-SQL
Concepts
Retrieve data from stored
locations in rowset format
Transform the rowset(s)
Store the transformed rowset(s)
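The rowset flow above can be sketched in plain Python. This is only an analogy, not U-SQL: the in-memory CSV text stands in for a file in the store, the final write-out step is an assumption about how a script would end, and the names `SOURCE`, `extract`, `transform`, and `output` are hypothetical.

```python
import csv
import io

# Hypothetical input standing in for a file in the store.
SOURCE = "region,amount\nEU,10\nUS,5\nEU,7\n"


def extract(text: str) -> list[dict]:
    """Step 1: read stored data into a rowset (a list of rows)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list[dict]) -> list[dict]:
    """Step 2: transform the rowset, here by summing amount per region."""
    totals: dict[str, int] = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + int(row["amount"])
    return [{"region": r, "total": t} for r, t in sorted(totals.items())]


def output(rows: list[dict]) -> str:
    """Step 3: write the transformed rowset back out as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["region", "total"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


result = output(transform(extract(SOURCE)))
print(result)
```

Each function consumes and returns a whole rowset, which is the point of the analogy: a script is a chain of rowset-to-rowset steps rather than row-by-row procedural code.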
U-SQL Script Structure
Script :=
Statement_List.
Statement_List :=
{ [Statement] ';' }.
Statement :=
Use_Statement
| If_Else_Statement
| Declare_Variable_Statement
| Reference_Assembly_Statement
| Deploy_Resource_Statement
| DDL_Statement
| Query_Statement
| Procedure_Call
| Import_Package_Statement
| DML_Statement
| Output_Statement.
U-SQL Script Structure
U-SQL Built-in Extractors:
• Extractors.Text()
• Extractors.Csv()
• Extractors.Tsv()
Extractors
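As a rough illustration of how the three built-in extractors differ, the Python sketch below treats them as the same delimited-text reader with different column delimiters. It assumes that Csv uses a comma, Tsv uses a tab, and Text takes a configurable delimiter; the function `extract_rows` and the sample data are hypothetical, and this is an analogy rather than the U-SQL implementation.

```python
import csv
import io

# Assumed mapping of the fixed-delimiter extractors to their separators.
DELIMITERS = {"Csv": ",", "Tsv": "\t"}


def extract_rows(text: str, extractor: str = "Text", delimiter: str = ",") -> list[list[str]]:
    """Split delimited text into rows, mimicking the chosen extractor."""
    # Csv and Tsv fix the delimiter; Text uses the one the caller passes.
    sep = DELIMITERS.get(extractor, delimiter)
    return [row for row in csv.reader(io.StringIO(text), delimiter=sep)]


csv_rows = extract_rows("1,alice\n2,bob\n", "Csv")
tsv_rows = extract_rows("1\talice\n2\tbob\n", "Tsv")
pipe_rows = extract_rows("1|alice\n2|bob\n", "Text", delimiter="|")
print(csv_rows == tsv_rows == pipe_rows)
```

All three calls produce the same rowset, which suggests that choosing an extractor is mostly a matter of matching the file's delimiter.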
U-SQL Jobs
[Diagram: units of work (V) distributed across ADLAUs]
U-SQL Jobs
ADLAU = Azure Data Lake Analytics Unit
Parallelism N = N ADLAUs
1 ADLAU ~= a VM with 2 cores and 6 GB of memory
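The sizing rule above lends itself to a quick back-of-the-envelope calculation. The sketch below uses the slide's approximation (1 ADLAU ~= 2 cores and 6 GB of memory) and assumes that consumption is measured as ADLAUs multiplied by run time; `JobResources` and `estimate` are hypothetical names, and no prices are included.

```python
from dataclasses import dataclass

CORES_PER_ADLAU = 2      # from the slide: 1 ADLAU ~= 2 cores
MEMORY_GB_PER_ADLAU = 6  # from the slide: 1 ADLAU ~= 6 GB of memory


@dataclass(frozen=True)
class JobResources:
    adlaus: int
    cores: int
    memory_gb: int
    au_hours: float


def estimate(parallelism: int, runtime_hours: float) -> JobResources:
    """Approximate the resources behind a job run with the given parallelism."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return JobResources(
        adlaus=parallelism,  # Parallelism N = N ADLAUs
        cores=parallelism * CORES_PER_ADLAU,
        memory_gb=parallelism * MEMORY_GB_PER_ADLAU,
        # Assumption: consumption is ADLAUs multiplied by hours of run time.
        au_hours=parallelism * runtime_hours,
    )


job = estimate(parallelism=10, runtime_hours=0.5)
print(job)
```

Note that doubling the parallelism doubles the reserved cores and memory, but it shortens the job only if the work can actually be split that far.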
U-SQL Catalog
Database
Table
Views
Procedures
DEMO
Monitoring
Azure Portal
Monitoring
Visual Studio
DEMO
Pricing
Links
• http://www.sqlservercentral.com/stairway/142480/
• https://azure.microsoft.com/en-us/solutions/data-lake/
Questions?
Thank you
