SlideShare a Scribd company logo
By:
Dr. Alice Mugisha
alice.mugisha@mak.ac.ug
Course Outline
 Week 1: Introduction to Business Intelligence , data warehouses and
Corporate Information
 Week 2: Data warehouse architectures
 Week 3: Data warehouse lifecycle and project management
 Week 4: Introduction to Dimensional Modeling
 Week 5: Building dimensional models – tutorial session
 Week 6: Test 1 and Data extraction, transformation and loading(ETL)
 Week 7: Online analytical processing (OLAP)
 Week 8: Data mining
 Week 9: Data warehousing and the Web
 Week 10: Data warehousing and the Web
 Week 11: Matching information to users
2
Introduction
 Data is an enterprise’s most valuable asset. To
fuel innovation, which fuels growth, an
enterprise must:
 Store every relevant data point about their
business
 Give data access to everyone who needs it
 Have the ability to analyze the data in
different ways
 Distill the data down to insights
3
Introduction
 Data
 Items that are the most elementary descriptions of
things, events, activities, and transactions
 May be internal or external
 Information
 Organized data that has meaning and value
 Knowledge
 Processed data or information that conveys
understanding or learning applicable to a problem
or activity
4
Introduction
 Over the years, storage and management of data
from various operational systems has become a
great challenge.
 Long-term strategic planning has become
increasingly important in the modern global
market.
 For this reason, companies have worked towards:
 Access to information at all levels
 Survival and prosperity in a competitive world.
 The focus of technology shifted from data input
and capture through the operational systems to
information access and availability for strategic
planning.
5
6
 Executives and managers need information to
formulate the business strategies, establish goals, set
objectives, and monitor results.
 Examples of business objectives include:
 Retain the present customer base
 Increase the customer base by 15% over the next 5
years
 Gain market share by 10% in the next 3 years
 Improve product quality levels in the top five product
groups
 Enhance customer service level in shipments
 Bring three new products to market in 2 years
 Increase sales by 15% in a given supermarket branch
7
What is a Data Warehouse (DWH)?
 A decision support database that is maintained
separately from the organization’s operational database
 Support information processing by providing a solid
platform of consolidated, historical data for analysis.
 “A data warehouse is a subject-oriented,
integrated, time-variant, historical and non-
volatile collection of data in support of
management’s decision-making process.”—
W. H. Inmon
8
9
Data Warehouse—Subject-Oriented
 Organized around major subjects, such as customer,
product, sales.
 Focusing on the modeling and analysis of data for
decision makers, not on daily operations or
transaction processing.
 Provide a simple and concise view around particular
subject issues by excluding data that are not useful in
the decision support process.
10
Data Warehouse—Integrated
 Constructed by integrating multiple,
heterogeneous data sources
 relational databases, flat files, on-line transaction
records
 Data cleaning and data integration techniques
are applied.
 Ensure consistency in naming conventions, encoding
structures, attribute measures, etc. among different
data sources
 E.g., Hotel price: currency, tax, breakfast covered, etc.
 When data is moved to the warehouse, it is converted.
11
Data Warehouse—Time Variant
 The time horizon for the data warehouse is
significantly longer than that of operational systems.
 Operational database: current value data.
 Data warehouse data: provide information from a
historical perspective (e.g., past 5-10 years)
 Every key structure in the data warehouse
 Contains an element of time, explicitly or implicitly
 But the key of operational data may or may not contain
“time element”.
12
Data Warehouse—Non-Volatile
 A physically separate store of data transformed
from the operational environment.
 Operational update of data does not occur in
the data warehouse environment.
 Does not require transaction processing, recovery,
and concurrency control mechanisms
 Requires only two operations in data accessing:
 initial loading of data and access of data.
Basic concept of data warehousing
 Take all the data from the operational systems
 Where necessary, include relevant data from
outside, such as industry benchmark indicators
 Integrate all the data from the various sources
 Remove inconsistencies and transform the
data
 Store the data in formats suitable for easy
access for decision making
13
A blend of technologies
14
15
 Although a simple concept, it involves different
functions: data extraction, the function of loading
the data, transforming the data, storing the data,
and providing user interfaces.
 Different technologies are, therefore, needed to
support these functions. Although many
technologies are in use, they all work together in a
data warehouse.
 The end result is the creation of a new computing
environment for the purpose of providing the
strategic information every enterprise needs
desperately.
 There are several vendor tools available in each of
these technologies. You do not have to build your
data warehouse from scratch.
16
Data Warehouse vs. traditional integration in
Heterogeneous DBMS
 Traditional heterogeneous DB integration:
 Build wrappers/mediators on top of heterogeneous databases
 Query driven approach
 When a query is posed to a client site, a meta-dictionary is used to
translate the query into queries appropriate for individual
heterogeneous sites involved, and the results are integrated into a
global answer set
 Complex information filtering, compete for resources
 Data warehouse: update-driven, high performance
 Information from heterogeneous sources is integrated in advance
and stored in warehouses for direct query and analysis
17
Data Warehouse (OLAP) vs. Operational DBMS (OLTP)
 OLTP (on-line transaction processing)
 Major task of traditional relational DBMS
 Day-to-day operations such as purchasing, inventory,
banking, manufacturing, payroll, registration,
accounting, etc.
 OLAP (on-line analytical processing)
 Major task of data warehouse system
 Data analysis and decision making
 Distinct features (OLTP vs. OLAP):
 User and system orientation: customer vs. market
 Data contents: current, detailed vs. historical, consolidated
 Database design: ER + application vs. star + subject
 View: current, local vs. evolutionary, integrated
 Access patterns: update vs. read-only but complex queries
18
OLTP vs. OLAP
OLTP OLAP
users clerk, IT professional knowledge worker
function day to day operations decision support
DB design application-oriented subject-oriented
data current, up-to-date
detailed, flat relational
isolated
historical,
summarized, multidimensional
integrated, consolidated
usage repetitive ad-hoc
access read/write
index/hash on prim. key
lots of scans
unit of work short, simple transaction complex query
# records accessed tens millions
#users thousands hundreds
DB size 1
00MB-GB 1
00GB-TB
metric transaction throughput query throughput, response
Successfully implemented data warehouses can bring
benefits to an organization as below:
1. Organisational Performance: - Operational Systems and
Data Warehouse
2. Simplify - Make Complex Data from Many Systems
available in one
3. Accuracy - Standardize and Cleanse
4. Business Value - Provide the Foundation for the Business
to Have Access to Information to Make Timely, Informed
Decisions
5. Direct Use - Non-IT personnel can make reports
19
20
Typical data warehouse queries (Case study: Banking
industry)
• Which corporate customers are above the average
account usage per month and how does this correlate to
their business?
• Who were the first hundred customers in Jan 2006 and
how does this list compare with the list for the previous
three years?
• What is the revenue by destination, by month, by
business unit, by region?
Complexities of Creating a Data Warehouse
 Incomplete errors
 Missing Fields
 Records or Fields That, by Design, are not Being
Recorded
 Incorrect errors
 Wrong Calculations, Aggregations
 Duplicate Records
 Wrong Information Entered into Source
System
 Inconsistency errors
 Inconsistent Use of Different Codes
21
Best Practices
 Data Warehousing is a process and not a
project
 Complete requirements and design
 Prototyping is key to business
understanding
 Utilizing proper aggregations and detailed
data
 A full iterative approach is essential
 Training is an on-going process
 Build data integrity checks into your system
22
23
•High investment
•The initial cost of building a data warehouse is very high
and ROI cannot easily be explained.
• Large storage
•Data warehouse stores useful historical data of an
enterprise.
• Maintenance of source systems
• If data source systems are not cleaned, we automatically
get dirty data into the data warehouse. Decision makers
using such data are likely to be misled and their decisions
may lead to loss of company revenue.
• Qualified staff
•Data warehouse building and maintenance requires
24
• New insights into
• Customer habits
• Developing new products
• Selling more products
• Cost savings and revenue increases
• Cross-selling of products
• Identify and target most profitable customers
Conclusions
 Building data warehouse is good but not
sufficient. The data in a data warehouse has to
be accessed by users and in order to access it; a
BI tool has to be used.
25
 All processes, techniques, and tools that support
business decision making based on information
technology.
 The approaches can range from a simple excel
spreadsheet to a major competitive intelligence
undertaking.
 e.g. data visualization, data mining, statistical
analysis using R, SPSS, etc
What is Business Intelligence (BI)?
26
Discussion Questions
 The current trends in hardware/software technology make
data warehousing feasible. Explain via some examples how
exactly technology trends do help.
 You are the IT Director of a nationwide insurance company.
Write a memo to the Executive Vice President explaining the
types of opportunities that can be realized with readily
available strategic information.
 For an airlines company, how can strategic information
increase the number of frequent flyers? Discuss giving
specific details.
 You are a Senior Analyst in the IT department of a company
manufacturing automobile parts. The marketing VP is
complaining about the poor response by IT in providing
strategic information. Explain the reasons for the problems
and why a data warehouse would be the only viable solution.
27

More Related Content

Similar to Introduction to Business Intelligence and Data warehousing - ppt (20)

PDF
TOPIC 9 data warehousing and data mining.pdf
SCITprojects2022
 
PPT
Dataware housing
work
 
PPT
Data Warehousing Datamining Concepts
raulmisir
 
PPTX
DWDM Unit 1 (1).pptx
SalehaMariyam
 
PPT
Datawarehouse and OLAP
SAS SNDP YOGAM COLLEGE,KONNI
 
PPT
krithi-talk-impact.ppt
srirupadasgupta1
 
PPT
krithi-talk-impact.ppt
KRISHNARAJ207
 
PPTX
Data warehouse-complete-1-100227093028-phpapp01.pptx
ArunPatrick2
 
PPTX
dataWarehouse.pptx
hqlm1
 
PPTX
Data warehousing
Shruti Dalela
 
PPTX
BI LECTURE 3- 2023.pptx
AmanyaLaban
 
PPT
Ch1 data-warehousing
Ahmad Shlool
 
PPT
Ch1 data-warehousing
Ahmad Shlool
 
PPT
1-_Intro_to_Data_Minning__DWH.ppt
BsMath3rdsem
 
PPT
1.4 data warehouse
Krish_ver2
 
PPT
Yoyopresentasi 1225941108853502-8 2
Venkatesh Reddy
 
PPT
Data ware housing - Introduction to data ware housing process.
Vibrant Technologies & Computers
 
PPT
Data Warehouse-Final
Priyanka Manchanda ☁️
 
PDF
Data warehousing unit 1
WE-IT TUTORIALS
 
TOPIC 9 data warehousing and data mining.pdf
SCITprojects2022
 
Dataware housing
work
 
Data Warehousing Datamining Concepts
raulmisir
 
DWDM Unit 1 (1).pptx
SalehaMariyam
 
Datawarehouse and OLAP
SAS SNDP YOGAM COLLEGE,KONNI
 
krithi-talk-impact.ppt
srirupadasgupta1
 
krithi-talk-impact.ppt
KRISHNARAJ207
 
Data warehouse-complete-1-100227093028-phpapp01.pptx
ArunPatrick2
 
dataWarehouse.pptx
hqlm1
 
Data warehousing
Shruti Dalela
 
BI LECTURE 3- 2023.pptx
AmanyaLaban
 
Ch1 data-warehousing
Ahmad Shlool
 
Ch1 data-warehousing
Ahmad Shlool
 
1-_Intro_to_Data_Minning__DWH.ppt
BsMath3rdsem
 
1.4 data warehouse
Krish_ver2
 
Yoyopresentasi 1225941108853502-8 2
Venkatesh Reddy
 
Data ware housing - Introduction to data ware housing process.
Vibrant Technologies & Computers
 
Data Warehouse-Final
Priyanka Manchanda ☁️
 
Data warehousing unit 1
WE-IT TUTORIALS
 

Recently uploaded (20)

PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PDF
Persuasive AI: risks and opportunities in the age of digital debate
Speck&Tech
 
PDF
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
PDF
CIFDAQ Token Spotlight for 9th July 2025
CIFDAQ
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
PDF
TrustArc Webinar - Data Privacy Trends 2025: Mid-Year Insights & Program Stra...
TrustArc
 
PDF
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
PDF
Wojciech Ciemski for Top Cyber News MAGAZINE. June 2025
Dr. Ludmila Morozova-Buss
 
PDF
Predicting the unpredictable: re-engineering recommendation algorithms for fr...
Speck&Tech
 
PDF
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
PDF
Complete Network Protection with Real-Time Security
L4RGINDIA
 
PPTX
MSP360 Backup Scheduling and Retention Best Practices.pptx
MSP360
 
PDF
Complete JavaScript Notes: From Basics to Advanced Concepts.pdf
haydendavispro
 
PDF
The Builder’s Playbook - 2025 State of AI Report.pdf
jeroen339954
 
PDF
Blockchain Transactions Explained For Everyone
CIFDAQ
 
PPTX
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
PDF
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
PDF
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
PDF
July Patch Tuesday
Ivanti
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
Persuasive AI: risks and opportunities in the age of digital debate
Speck&Tech
 
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
CIFDAQ Token Spotlight for 9th July 2025
CIFDAQ
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
TrustArc Webinar - Data Privacy Trends 2025: Mid-Year Insights & Program Stra...
TrustArc
 
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
Wojciech Ciemski for Top Cyber News MAGAZINE. June 2025
Dr. Ludmila Morozova-Buss
 
Predicting the unpredictable: re-engineering recommendation algorithms for fr...
Speck&Tech
 
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
Complete Network Protection with Real-Time Security
L4RGINDIA
 
MSP360 Backup Scheduling and Retention Best Practices.pptx
MSP360
 
Complete JavaScript Notes: From Basics to Advanced Concepts.pdf
haydendavispro
 
The Builder’s Playbook - 2025 State of AI Report.pdf
jeroen339954
 
Blockchain Transactions Explained For Everyone
CIFDAQ
 
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
July Patch Tuesday
Ivanti
 
Ad

Introduction to Business Intelligence and Data warehousing - ppt

  • 2. Course Outline  Week 1: Introduction to Business Intelligence , data warehouses and Corporate Information  Week 2: Data warehouse architectures  Week 3: Data warehouse lifecycle and project management  Week 4: Introduction to Dimensional Modeling  Week 5: Building dimensional models – tutorial session  Week 6: Test 1 and Data extraction, transformation and loading(ETL)  Week 7: Online analytical processing (OLAP)  Week 8: Data mining  Week 9: Data warehousing and the Web  Week 10: Data warehousing and the Web  Week 11: Matching information to users 2
  • 3. Introduction  Data is an enterprise’s most valuable asset. To fuel innovation, which fuels growth, an enterprise must:  Store every relevant data point about their business  Give data access to everyone who needs it  Have the ability to analyze the data in different ways  Distill the data down to insights 3
  • 4. Introduction  Data  Items that are the most elementary descriptions of things, events, activities, and transactions  May be internal or external  Information  Organized data that has meaning and value  Knowledge  Processed data or information that conveys understanding or learning applicable to a problem or activity 4
  • 5. Introduction  Over the years, storage and management of data from various operational systems has become a great challenge.  Long-term strategic planning has become increasingly important in the modern global market.  For this reason, companies have worked towards:  Access to information at all levels  Survival and prosperity in a competitive world.  The focus of technology shifted from data input and capture through the operational systems to information access and availability for strategic planning. 5
  • 6. 6  Executives and managers need information to formulate the business strategies, establish goals, set objectives, and monitor results.  Examples of business objectives include:  Retain the present customer base  Increase the customer base by 15% over the next 5 years  Gain market share by 10% in the next 3 years  Improve product quality levels in the top five product groups  Enhance customer service level in shipments  Bring three new products to market in 2 years  Increase sales by 15% in a given supermarket branch
  • 7. 7
  • 8. What is a Data Warehouse (DWH)?  A decision support database that is maintained separately from the organization’s operational database  Support information processing by providing a solid platform of consolidated, historical data for analysis.  “A data warehouse is a subject-oriented, integrated, time-variant, historical and non- volatile collection of data in support of management’s decision-making process.”— W. H. Inmon 8
  • 9. 9 Data Warehouse—Subject-Oriented  Organized around major subjects, such as customer, product, sales.  Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.  Provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.
  • 10. 10 Data Warehouse—Integrated  Constructed by integrating multiple, heterogeneous data sources  relational databases, flat files, on-line transaction records  Data cleaning and data integration techniques are applied.  Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources  E.g., Hotel price: currency, tax, breakfast covered, etc.  When data is moved to the warehouse, it is converted.
  • 11. 11 Data Warehouse—Time Variant  The time horizon for the data warehouse is significantly longer than that of operational systems.  Operational database: current value data.  Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)  Every key structure in the data warehouse  Contains an element of time, explicitly or implicitly  But the key of operational data may or may not contain “time element”.
  • 12. 12 Data Warehouse—Non-Volatile  A physically separate store of data transformed from the operational environment.  Operational update of data does not occur in the data warehouse environment.  Does not require transaction processing, recovery, and concurrency control mechanisms  Requires only two operations in data accessing:  initial loading of data and access of data.
  • 13. Basic concept of data warehousing  Take all the data from the operational systems  Where necessary, include relevant data from outside, such as industry benchmark indicators  Integrate all the data from the various sources  Remove inconsistencies and transform the data  Store the data in formats suitable for easy access for decision making 13
  • 14. A blend of technologies 14
  • 15. 15  Although a simple concept, it involves different functions: data extraction, the function of loading the data, transforming the data, storing the data, and providing user interfaces.  Different technologies are, therefore, needed to support these functions. Although many technologies are in use, they all work together in a data warehouse.  The end result is the creation of a new computing environment for the purpose of providing the strategic information every enterprise needs desperately.  There are several vendor tools available in each of these technologies. You do not have to build your data warehouse from scratch.
  • 16. 16 Data Warehouse vs. traditional integration in Heterogeneous DBMS  Traditional heterogeneous DB integration:  Build wrappers/mediators on top of heterogeneous databases  Query driven approach  When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for individual heterogeneous sites involved, and the results are integrated into a global answer set  Complex information filtering, compete for resources  Data warehouse: update-driven, high performance  Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis
  • 17. 17 Data Warehouse (OLAP) vs. Operational DBMS (OLTP)  OLTP (on-line transaction processing)  Major task of traditional relational DBMS  Day-to-day operations such as purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.  OLAP (on-line analytical processing)  Major task of data warehouse system  Data analysis and decision making  Distinct features (OLTP vs. OLAP):  User and system orientation: customer vs. market  Data contents: current, detailed vs. historical, consolidated  Database design: ER + application vs. star + subject  View: current, local vs. evolutionary, integrated  Access patterns: update vs. read-only but complex queries
  • 18. 18 OLTP vs. OLAP OLTP OLAP users clerk, IT professional knowledge worker function day to day operations decision support DB design application-oriented subject-oriented data current, up-to-date detailed, flat relational isolated historical, summarized, multidimensional integrated, consolidated usage repetitive ad-hoc access read/write index/hash on prim. key lots of scans unit of work short, simple transaction complex query # records accessed tens millions #users thousands hundreds DB size 1 00MB-GB 1 00GB-TB metric transaction throughput query throughput, response
  • 19. Successfully implemented data warehouses can bring benefits to an organization as below: 1. Organisational Performance: - Operational Systems and Data Warehouse 2. Simplify - Make Complex Data from Many Systems available in one 3. Accuracy - Standardize and Cleanse 4. Business Value - Provide the Foundation for the Business to Have Access to Information to Make Timely, Informed Decisions 5. Direct Use - Non-IT personnel can make reports 19
  • 20. 20 Typical data warehouse queries (Case study: Banking industry) • Which corporate customers are above the average account usage per month and how does this correlate to their business? • Who were the first hundred customers in Jan 2006 and how does this list compare with the list for the previous three years? • What is the revenue by destination, by month, by business unit, by region?
  • 21. Complexities of Creating a Data Warehouse  Incomplete errors  Missing Fields  Records or Fields That, by Design, are not Being Recorded  Incorrect errors  Wrong Calculations, Aggregations  Duplicate Records  Wrong Information Entered into Source System  Inconsistency errors  Inconsistent Use of Different Codes 21
  • 22. Best Practices  Data Warehousing is a process and not a project  Complete requirements and design  Prototyping is key to business understanding  Utilizing proper aggregations and detailed data  A full iterative approach is essential  Training is an on-going process  Build data integrity checks into your system 22
  • 23. 23 •High investment •The initial cost of building a data warehouse is very high and ROI cannot easily be explained. • Large storage •Data warehouse stores useful historical data of an enterprise. • Maintenance of source systems • If data source systems are not cleaned, we automatically get dirty data into the data warehouse. Decision makers using such data are likely to be misled and their decisions may lead to loss of company revenue. • Qualified staff •Data warehouse building and maintenance requires
  • 24. 24 • New insights into • Customer habits • Developing new products • Selling more products • Cost savings and revenue increases • Cross-selling of products • Identify and target most profitable customers
  • 25. Conclusions  Building data warehouse is good but not sufficient. The data in a data warehouse has to be accessed by users and in order to access it; a BI tool has to be used. 25
  • 26.  All processes, techniques, and tools that support business decision making based on information technology.  The approaches can range from a simple excel spreadsheet to a major competitive intelligence undertaking.  e.g. data visualization, data mining, statistical analysis using R, SPSS, etc What is Business Intelligence (BI)? 26
  • 27. Discussion Questions  The current trends in hardware/software technology make data warehousing feasible. Explain via some examples how exactly technology trends do help.  You are the IT Director of a nationwide insurance company. Write a memo to the Executive Vice President explaining the types of opportunities that can be realized with readily available strategic information.  For an airlines company, how can strategic information increase the number of frequent flyers? Discuss giving specific details.  You are a Senior Analyst in the IT department of a company manufacturing automobile parts. The marketing VP is complaining about the poor response by IT in providing strategic information. Explain the reasons for the problems and why a data warehouse would be the only viable solution. 27

Editor's Notes