Pipelines and Packages: Introduction to Azure Data Factory (Techorama NL 2019)
Pipelines and Packages:
Introduction to Azure Data Factory
Cathrine Wilhelmsen
Techorama NL · Oct 2, 2019
As Data Engineers and ETL Developers, our main responsibilities are to move, transform, integrate and
prepare data for our end users as quickly and efficiently as possible. With the ever-increasing volume
and variety of data, this can easily start to feel like a daunting task.
Azure Data Factory (ADF) is a hybrid data integration service that lets you build, orchestrate and monitor
complex and scalable data pipelines - without writing any code. The first version of Azure Data Factory
may not have lived entirely up to its nickname "SSIS in the Cloud", but the second version has been
drastically improved and expanded with new capabilities.
But wait, what's that? You have already invested years and millions in a comprehensive SSIS solution, you
say? No problem! You can lift and shift your existing SSIS packages into Azure Data Factory to start
modernizing your solution while retaining the investments you have already made.
In this session, we will first go through the fundamentals of Azure Data Factory and see how easy it is to
build new data pipelines or migrate your existing SSIS packages. Then, we will explore some of the major
improvements in Azure Data Factory v2, including the new Mapping Data Flows. Finally, we will look at
design patterns and best practices for development to speed up productivity while keeping costs down.
@cathrinew
cathrinew.net
Data Warehousing
Business Intelligence
Artificial Intelligence
Big Data and Analytics
Machine Learning
Data Science
What?
When?
Why?
Collect
Store
Transform
Integrate
Prepare
10 years ago…
SSIS
SQL Server Integration Services
Then…
ADF v1
Azure Data Factory Version 1
Today…
ADF v2
Azure Data Factory Version 2
Azure
Data Factory
What is Azure Data Factory?
Hybrid data integration service
Complex and scalable pipelines
No-code ETL/ELT data flows
What can you do in Azure Data Factory?
Copy Data
Transform Data
What is inside Azure Data Factory?
Pipelines
Activities
Datasets
Linked Services
Data Flows
Templates
Triggers
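To make the relationship between these components concrete, here is a minimal sketch in Python of how a pipeline, its activity, the datasets and the linked services might reference each other. All names (`BlobStorageLS`, `SqlDatabaseLS`, `SourceCsv`, `SinkTable`, `CopyCsvToSql`) are hypothetical, and the property names follow the ADF JSON format as commonly documented, so treat them as assumptions to verify rather than a definitive schema.

```python
# Hypothetical linked services: connection definitions for external stores.
linked_services = {
    "BlobStorageLS": {"type": "AzureBlobStorage"},
    "SqlDatabaseLS": {"type": "AzureSqlDatabase"},
}

# Hypothetical datasets: each one points at a linked service by name.
datasets = {
    "SourceCsv": {
        "type": "DelimitedText",
        "linkedServiceName": {"referenceName": "BlobStorageLS", "type": "LinkedServiceReference"},
    },
    "SinkTable": {
        "type": "AzureSqlTable",
        "linkedServiceName": {"referenceName": "SqlDatabaseLS", "type": "LinkedServiceReference"},
    },
}

# Hypothetical pipeline: one Copy activity that reads one dataset and writes another.
pipeline = {
    "name": "CopyCsvToSql",
    "properties": {
        "activities": [
            {
                "name": "CopyCsv",
                "type": "Copy",
                "inputs": [{"referenceName": "SourceCsv", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "SinkTable", "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "AzureSqlSink"},
                },
            }
        ]
    },
}


def unresolved_references(pipeline, datasets, linked_services):
    """Return dataset or linked service names the pipeline needs but that are not defined."""
    missing = []
    for activity in pipeline["properties"]["activities"]:
        for ref in activity.get("inputs", []) + activity.get("outputs", []):
            dataset_name = ref["referenceName"]
            if dataset_name not in datasets:
                missing.append(dataset_name)
                continue
            service_name = datasets[dataset_name]["linkedServiceName"]["referenceName"]
            if service_name not in linked_services:
                missing.append(service_name)
    return missing


print(unresolved_references(pipeline, datasets, linked_services))
```

The helper only checks that every name resolves. It illustrates the chain pipeline → activity → dataset → linked service and is not a substitute for the validation ADF itself performs.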
DEMO
Let's look inside
Azure Data Factory!
Wait…
I already have
thousands of SSIS
packages!
And…
You told me to use
this Biml thing!
SSIS
Lift and Shift
What does Lift and Shift mean?
Lift up existing SSIS packages
Shift them to a new location
Why should you Lift and Shift SSIS?
Modernize while retaining investments
Continue to use familiar tools
Reduce maintenance and costs (*)
How do you Lift and Shift SSIS?
1. Configure Azure-SSIS Integration Runtime
2. Deploy SSIS Packages to SSISDB in Azure
3. Orchestrate SSIS Packages in Azure Data Factory
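Step 3 can be sketched as a pipeline definition built in Python. This is an illustration under assumptions, not the speaker's demo: the activity type `ExecuteSSISPackage` and the `packageLocation`, `connectVia` and `dependsOn` properties follow the ADF JSON format as commonly documented, while the integration runtime name `MyAzureSsisIR` and the package paths are hypothetical.

```python
import json


def execute_ssis_package_activity(name, package_path, integration_runtime, depends_on=None):
    """Build an Execute SSIS Package activity definition as a plain dict.

    package_path is the SSISDB path (folder/project/package.dtsx);
    integration_runtime is the name of an Azure-SSIS Integration Runtime.
    """
    activity = {
        "name": name,
        "type": "ExecuteSSISPackage",
        "typeProperties": {
            "packageLocation": {"packagePath": package_path},
            "connectVia": {
                "referenceName": integration_runtime,
                "type": "IntegrationRuntimeReference",
            },
        },
    }
    if depends_on:
        # Run only after the named activity has succeeded.
        activity["dependsOn"] = [
            {"activity": depends_on, "dependencyConditions": ["Succeeded"]}
        ]
    return activity


# Hypothetical packages: load staging first, then the warehouse.
staging = execute_ssis_package_activity(
    "LoadStaging", "Demo/Warehouse/LoadStaging.dtsx", "MyAzureSsisIR"
)
warehouse = execute_ssis_package_activity(
    "LoadWarehouse", "Demo/Warehouse/LoadWarehouse.dtsx", "MyAzureSsisIR",
    depends_on="LoadStaging",
)

pipeline = {
    "name": "RunSsisPackages",
    "properties": {"activities": [staging, warehouse]},
}

print(json.dumps(pipeline, indent=2))
```

The existing packages stay unchanged. Only the orchestration (ordering, scheduling, monitoring) moves into the ADF pipeline.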
Azure-SSIS Integration Runtime
Managed cluster of Azure VMs dedicated to SSIS
Billed while running (like all VMs)
Manage cost by running when necessary
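A rough back-of-the-envelope calculation shows why running the Azure-SSIS Integration Runtime only when necessary matters. The hourly rate below is a placeholder and not a real Azure price, and the 30-day month is a simplifying assumption.

```python
def monthly_ir_cost(hourly_rate, hours_per_day, days_per_month=30):
    """Estimate the monthly cost of an integration runtime billed per running hour."""
    return hourly_rate * hours_per_day * days_per_month


HYPOTHETICAL_HOURLY_RATE = 1.0  # placeholder value, not an actual Azure price

always_on = monthly_ir_cost(HYPOTHETICAL_HOURLY_RATE, hours_per_day=24)
nightly_batch = monthly_ir_cost(HYPOTHETICAL_HOURLY_RATE, hours_per_day=2)

print(always_on)      # 720.0
print(nightly_batch)  # 60.0
```

With these assumed numbers, starting the runtime for a two-hour nightly batch and stopping it afterwards costs one twelfth of leaving it on around the clock.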
DEMO
Let’s explore SSIS in
Azure Data Factory!
ADF vs SSIS
Pipeline ≈ Package
Linked Service ≈ Connection Manager
Source ≈ Source
Sink ≈ Destination
Activity ≈ Control Flow Task
Data Flow ≈ Data Flow
Mapping
Data Flows
What are Mapping Data Flows?
Data transformation at scale
Runs on Azure Databricks
Visual editor, no-code experience
How do Mapping Data Flows work?
Why use Mapping Data Flows?
Transform Data
Upsert Data
Load a Data Warehouse
Handle schema drift
What is Schema Drift?
Rapidly changing source files and metadata:
• Added / Removed Columns
• Renamed Column Names
• Changed Data Types
If not handled properly, Schema Drift can (and most likely will) cause problems in the downstream pipeline
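The three kinds of change listed above can be illustrated with a small Python sketch that compares an expected schema with the one actually observed. This is not how Mapping Data Flows handle drift internally. It is a plain dictionary comparison, and the column names and type labels are made up for the example.

```python
def detect_schema_drift(expected, observed):
    """Compare two {column_name: type_name} mappings and report the differences."""
    expected_cols = set(expected)
    observed_cols = set(observed)
    return {
        "added": sorted(observed_cols - expected_cols),
        "removed": sorted(expected_cols - observed_cols),
        "changed_type": sorted(
            col for col in expected_cols & observed_cols
            if expected[col] != observed[col]
        ),
    }


# Hypothetical schemas: "name" was renamed to "full_name", "currency" was added,
# and "id" changed from int to string.
expected = {"id": "int", "name": "string", "amount": "decimal"}
observed = {"id": "string", "full_name": "string", "amount": "decimal", "currency": "string"}

drift = detect_schema_drift(expected, observed)
print(drift)
```

A renamed column shows up as one removed and one added column, because names alone cannot tell a rename apart from a drop plus an add.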
Schema Drift in SSIS
Schema Drift in ADF
Oh no!
DEMO
Let's create some
mapping data flows!
Lessons Learned
In ADF, everything has a price
SSIS best practices != ADF best practices
Learn how to learn and adapt
Good luck!
@cathrinew
cathrinew.net
hi@cathrinew.net
thank you!
