Why shift from ETL to ELT?
Author: Prakash Jalihal
Contributor: Vedvrat Shikarpur
The process of data warehousing is undergoing rapid transformation, giving rise to various new
terms, especially due to the shift from the traditional ETL to the newer ELT. For someone new to
the process, these additional terms and abbreviations might seem overwhelming. Some may
even ask, “Why does it matter if the L comes before the T?”
The answer lies in the infrastructure and the setup. Here is what the fuss is all about: the sequencing of
the words and, more importantly, why you should be shifting from ETL to ELT.
Understanding Data Warehousing Processes
A Data Warehouse or Enterprise Data Warehouse (EDW) is a system implemented for the purpose of
reporting and data analysis. It is a central repository of integrated data from disparate sources, used
for generating reports.
The popular definition from Bill Inmon is, “It is a subject oriented, integrated, time variant and
non-volatile collection of data used for decision making process.”1
• Subject oriented: A data warehouse can be used to analyze a particular subject area.
• Integrated: A data warehouse integrates data from one or more disparate data sources.
• Time variant: Historical data is stored in a data warehouse.
• Non-volatile: Once data is entered into a data warehouse, it is not changed or altered.
What is ETL?
ETL stands for extraction, transformation and loading, and is the process of moving data from the
source systems into the data warehouse. These three steps are critical components for feeding a data
warehouse, a business intelligence system or a big data platform.
The ETL processes are:
• Extraction: Extracts raw data from the source databases or storage systems.
• Transformation: Simplifies and reconciles data across source systems, performs analysis and
enriches the data with external lookup information. This stage also converts the data to the format
required by the target system.
• Loading: Loads the resultant data into the data warehouse or EDW, business intelligence (BI)
tools, etc.
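The three stages above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the sample records, the `transactions` table and the function names are all assumptions made for the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical raw records, standing in for rows pulled from a source system.
SOURCE_ROWS = [
    {"account": " A-100 ", "amount": "250.00", "currency": "usd"},
    {"account": "A-200", "amount": "n/a", "currency": "USD"},  # unusable amount
    {"account": "A-300", "amount": "75.50", "currency": "eur"},
]


def extract():
    """Extract: pull raw records from the source (here, an in-memory list)."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: clean and reconcile the records outside the warehouse."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop records whose amount cannot be parsed
        cleaned.append((row["account"].strip(), amount, row["currency"].upper()))
    return cleaned


def load(conn, rows):
    """Load: write only the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transactions "
        "(account TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the stages in ETL order and return what ended up in the warehouse."""
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT account, amount, currency FROM transactions ORDER BY account"
    ).fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    print(run_etl(warehouse))
```

Note that the record with the unparseable amount never reaches the warehouse: in ETL, whatever the transformation step discards is not available to later analysis.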
Advantages of ETL:
1. Single-view interface to integrate heterogeneous data.
2. Ability to join data both at the source and at the integration server, plus the option to apply
any business rule from within a single interface.
3. Common data infrastructure for working on data movement and data quality.
4. Parallel processing engine that provides performance and scalability.
1 Beye Network: Is Inmon's Data Warehouse Definition Still Accurate?
Shortcomings of ETL:
1. Migrating from the Server edition to the Enterprise edition of an ETL tool can require considerable
time and resources because of the many architectural differences between the two editions.
2. No automated error handling or recovery mechanism.
3. Expensive as a solution for small or midsized companies.2
What is ELT?
Until recently, it was normal to stage data in an intermediate system before pushing it into the target
system, because the target was optimized to retrieve and report (and not to perform heavy crunching of
numbers or data). This is why many preferred the ETL process, where the intermediate system is
optimized to perform calculations and data transformation (hence the name of the transformation
stage). This approach kept the target reporting system independent of the implementation
method used during the transform stage, but it resulted in organizations implementing three separate
systems to satisfy the requirements of each stage.
Since hardware systems today are better equipped and capable of doing a lot more, reporting and
calculations can be performed on the same system. This is where the ELT implementation comes in.
ELT stands for Extract, Load, Transform. It is an alternative to ETL that is commonly used with data
lake implementations. In ELT models, data is not transformed on entry to the data lake but is stored in
its raw form, resulting in faster loading times. In most cases, the design of the transformational
technology ties closely into the platform used for reporting, giving ELT the advantage of better
hardware and software sync-up.
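To make the change of order concrete, here is a minimal ELT sketch in Python. It is an illustration under stated assumptions: an in-memory SQLite database stands in for the target platform, and the table names `raw_transactions` and `transactions`, the sample rows and the function names are hypothetical. The raw records are loaded untouched, and the target engine then performs the transformation with its own SQL.

```python
import sqlite3

# Hypothetical raw rows, loaded exactly as they arrive (all values kept as text).
RAW_ROWS = [
    (" A-100 ", "250.00", "usd"),
    ("A-200", "n/a", "USD"),
    ("A-300", "75.50", "eur"),
]


def extract_and_load(conn, rows):
    """Extract + Load: land the records untouched in a raw table."""
    conn.execute(
        "CREATE TABLE raw_transactions (account TEXT, amount TEXT, currency TEXT)"
    )
    conn.executemany("INSERT INTO raw_transactions VALUES (?, ?, ?)", rows)


def transform_in_target(conn):
    """Transform: the target engine derives the clean table using SQL."""
    conn.execute(
        """
        CREATE TABLE transactions AS
        SELECT TRIM(account)        AS account,
               CAST(amount AS REAL) AS amount,
               UPPER(currency)      AS currency
        FROM raw_transactions
        WHERE amount GLOB '[0-9]*'   -- keep rows whose amount starts with a digit
        """
    )


def run_elt(rows):
    """Run the stages in ELT order; return the raw row count and the clean rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for the target platform
    extract_and_load(conn, rows)
    transform_in_target(conn)
    raw_count = conn.execute("SELECT COUNT(*) FROM raw_transactions").fetchone()[0]
    clean = conn.execute(
        "SELECT account, amount, currency FROM transactions ORDER BY account"
    ).fetchall()
    conn.close()
    return raw_count, clean


if __name__ == "__main__":
    print(run_elt(RAW_ROWS))
```

Unlike the ETL flow, every raw record remains available in the target, so a different transformation can be written later without going back to the source systems.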
Advantages:
1. No need for a separate transformation engine; the work is done by the target system itself.
2. Data transformation and loading happen in parallel, so less time and resources are spent (as
only filtered, clean data is loaded into the target system).
3. ELT works with high-end data engines such as Hadoop clusters, cloud platforms or data appliances.
This gives it additional performance and security.
4. The processing capability of the data warehousing infrastructure reduces the time that data spends
in transit and makes the system more cost effective.
2 ETL Tools: Major business and technical advantages and disadvantages of using DataStage ETL tool
Disadvantages:
1. The specifics of ELT development vary by platform. For example, Hadoop clusters work by breaking
a problem into smaller chunks, then distributing those chunks across a large number of machines
for processing. Some problems can be split easily; others are much harder.
2. Developers need to be aware of the nature of the system they’re using to perform
transformations. While some systems can handle nearly any transformation, others do not have
enough resources, requiring careful planning and design.3
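The first point can be illustrated with a small Python sketch. It does not use any Hadoop API: a thread pool stands in for the machines of a cluster, and the function names are hypothetical. A sum splits cleanly, because partial sums can simply be added together, whereas a median does not, because the median of the per-chunk medians is generally not the median of the whole data set.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def split(values, n_chunks):
    """Break the input into roughly equal chunks, one per worker."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def distributed_sum(values, n_chunks=4):
    """Easy to split: each worker sums its chunk, then the partial sums are added."""
    chunks = split(values, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


def naive_distributed_median(values, n_chunks=4):
    """Hard to split: the median of chunk medians is NOT the true median in general."""
    chunks = split(values, n_chunks)
    return median(median(chunk) for chunk in chunks)


if __name__ == "__main__":
    data = [1, 2, 100, 3, 4, 200, 5, 6, 7]
    print(distributed_sum(data, 3), sum(data))
    print(naive_distributed_median(data, 3), median(data))
```

For the sample data, the chunked sum matches the direct sum, while the naive chunked median differs from the true median. A correct distributed median needs extra coordination between the workers, which is the kind of planning the second point refers to.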
Comparison: ETL vs ELT
Although ETL and ELT are vastly different in terms of architecture and implementation, the main
difference lies in the rethinking of the approach taken to transferring data into reporting systems. ELT
takes full advantage of modern technology and along the way enhances the reporting solution with added
value such as tracing of data points.4
Another main attraction of ELT is the reduction in load time and in the time that data is in transit,
making it not just efficient but also cost-effective. Even though ELT requires a high-end system, it
drastically reduces the number of components required.5
Thus, despite an ELT implementation being more complex than the one-way
transaction-system-to-reporting ETL, ELT is now being preferred. Designing a proper ELT system might
take some work, but the payoff is well worth it!
In banking terms, only the data of value ends up in the Data Warehouse for ETL processes. What this
means is that you Extract the needed data into a staging area (in relational terms, often staging tables
or so-called global temporary tables), segregate it from unwanted data, perform data manipulation
(Transformation) and finally Load it into target tables in a Data Warehouse. Analysts then use
appropriate BI Tools to look at macroscopic trends in the data. This makes the process of data matching,
3 Ironside: ETL vs. ELT – What’s the Big Difference?
4 Blog: Performance Architects; Difference between ETL and ELT
5 TechTarget: Extract, Load, Transform (ELT) definition
Read more about HEXANIKA’s DRAAS solution at: http://hexanika.com/company-profile/
This is where ELT works best, for it is not confined to data deemed to be of specific value. Hadoop
(HDFS) can store everything from structured data (transactional databases) to unstructured
data (coming from Excel sheets, emails, logs, the internet and other sources). As raw data and
transformed data are saved on the same platform, data linkage and lineage processes are a lot faster
and more accurate. This also drastically reduces the Total Cost of Ownership (TCO), which is an
attractive proposition for financial institutions using Big Data storage systems.6
In summary, ELT allows you to extract and load all data as is into HDFS and then perform the
Transformation through schema on read, thereby simplifying the process of data warehousing.
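Schema on read can be sketched as follows. This is a simplified illustration, not HDFS code: a Python list of JSON strings stands in for raw files in the data store, and `REPORT_SCHEMA`, the sample records and the function name are assumptions made for the example. The records are stored exactly as they arrived, and a schema is applied only at the moment a report reads them.

```python
import json

# Raw records stored exactly as they arrived; note the inconsistent shapes.
RAW_LINES = [
    '{"account": "A-100", "amount": "250.00", "branch": "NYC"}',
    '{"account": "A-200", "amount": 75.5}',
    '{"acct": "A-300", "note": "free-text record with no amount"}',
]

# Hypothetical schema chosen by the reader: field name -> (converter, default).
REPORT_SCHEMA = {
    "account": (str, None),
    "amount": (float, 0.0),
}


def read_with_schema(lines, schema):
    """Apply the schema while reading; the stored raw lines are never modified."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }


if __name__ == "__main__":
    for row in read_with_schema(RAW_LINES, REPORT_SCHEMA):
        print(row)
```

Because the schema lives with the reader rather than with the stored data, a second report could read the same raw lines with a different schema (for example, one that picks up `branch`) without reloading anything.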
Hexanika: Efficient, Simple and Smart!
Hexanika is a FinTech Big Data software company, which has developed a revolutionary software
platform called SmartJoin™ for financial institutions to address data sourcing and reporting challenges
for regulatory compliance. SmartJoin™ improves data quality while the automated nature of
SmartReg™ keeps regulatory reporting in harmony with the dynamic regulatory requirements and keeps
pace with the new developments and latest regulatory updates.
6 LinkedIn Pulse: ETL or ELT and the Use Case
Hexanika leverages the power of ELT using distributed parallel processing, Big Data/Hadoop technology
with a secure data cloud (IBM Cloud). Understanding the high implementation costs of new systems and
the complexities involved in redesigning existing solutions, Hexanika offers a unique build that adapts to
existing architectures. This makes our solution cost-effective, efficient, simple and smart!
Read more about our solution and architecture at: http://hexanika.com/big-data-solution-architecture/
CONTACT US
USA
249 East 48 Street,
New York, NY 10017
Tel: +1 646.733.6636
INDIA
Krupa Bungalow 1187/10,
Shivaji Nagar, Pune 411005
Tel: +91 985068686
Email: info@hexanika.com
