data mining
Technical considerations
What Is a Data Warehouse:
• Definition: A data warehouse is the data repository of
an enterprise. It is generally used for research and
decision support.
• By comparison: an OLTP (on-line transaction processing)
or operational system is used to deal with the everyday
running of one aspect of an enterprise.
• OLTP systems are usually designed independently of
each other, and it is difficult for them to share
information.
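The contrast above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module with a hypothetical `sales` table and made-up rows (none of these names or values come from the slides): an operational request touches one row by key, while a decision-support query scans and summarizes the history.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (order_id, region, year, amount) VALUES (?, ?, ?, ?)",
    [
        (1, "north", 2022, 120.0),
        (2, "south", 2022, 80.0),
        (3, "north", 2023, 200.0),
        (4, "south", 2023, 50.0),
    ],
)

# OLTP-style work: read or change a single record identified by its key.
conn.execute("UPDATE sales SET amount = ? WHERE order_id = ?", (90.0, 2))
one_order = conn.execute(
    "SELECT amount FROM sales WHERE order_id = ?", (2,)
).fetchone()

# Warehouse-style work: scan the accumulated history and aggregate it.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(one_order)
print(totals)
```

In practice the two workloads would run on separate systems, which is the separation the slides argue for; a single table is used here only to keep the sketch short.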
Why Do We Need Data Warehouses?
• Consolidation of information resources
• Improved query performance
• Separate research and decision support
functions from the operational systems
• Foundation for data mining, data
visualization, advanced reporting and
OLAP tools
Building a Data Warehouse
1. Business Considerations (Return on
Investment)
2. Design Considerations
3. Technical Considerations
4. Implementation Considerations
5. Integrated Solutions
6. Benefits of Data Warehousing
Technical Considerations
• A number of technical issues must be considered when
designing and implementing a data warehouse environment:
1. The hardware platform that will house the data
warehouse, chosen for parallel query scalability
(uniprocessor, multiprocessor, etc.)
2. The DBMS that supports the warehouse database
3. The communication infrastructure that connects the
warehouse, data marts, operational systems, and end users
4. The hardware platform and software to support the
metadata repository
5. The systems management framework that enables
centralized management and administration of the entire
environment
HARDWARE PLATFORMS
Data warehouse implementations are
developed within already existing
environments.
This section looks at hardware
platform selection from an architectural
viewpoint.
A mainframe system, however, is not as
open and flexible as a contemporary
client/server system, and is not optimized
for ad hoc query processing.
In addition, the platform has to be scalable, since
the data warehouse is never finished: new user
requirements, new data sources, and more
historical data are continuously incorporated
into the warehouse.
Often the platform choice is a choice
between a mainframe and a non-MVS (UNIX or
Windows NT) server.
BALANCED APPROACH
An important design point when selecting
a scalable computing platform is the right
balance between all computing
components, for example between the
number of processors in a multiprocessor
system and the I/O bandwidth.
Remember that a lack of balance in a
system inevitably results in a bottleneck.
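The balance argument can be made concrete with a rough sketch. It assumes a deliberately simplified model in which a scan runs at the rate of the slower of two components; the function name and all figures (100 MB/s per processor, 400 MB/s of I/O bandwidth) are hypothetical and not taken from the slides.

```python
def scan_throughput_mb_s(processors, mb_per_s_per_processor, io_bandwidth_mb_s):
    """Return (throughput, limiting component) for a simple scan workload.

    Simplified model: the system runs at the rate of its slowest component.
    """
    cpu_capacity = processors * mb_per_s_per_processor
    if cpu_capacity <= io_bandwidth_mb_s:
        return cpu_capacity, "cpu"
    return io_bandwidth_mb_s, "io"


# Hypothetical figures: each processor can consume 100 MB/s,
# while the I/O subsystem delivers at most 400 MB/s.
for n in (2, 4, 8, 16):
    print(n, scan_throughput_mb_s(n, 100.0, 400.0))
```

Under these assumed figures, adding processors beyond four no longer raises throughput, because the I/O subsystem has become the bottleneck.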
OPTIMAL HARDWARE ARCHITECTURE
FOR PARALLEL QUERY SCALABILITY
An important consideration when selecting a
hardware platform for a data warehouse is
that of scalability.
Architecture-induced data skew is more
severe in low-density asymmetric
connection architectures.
When selecting a hardware platform for a
data warehouse, take into account the fact
that architecture-induced data skew can
overpower even the best data layout for
parallel query.
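Why skew limits parallel query scalability can be shown with a small sketch. It assumes a simplified model in which every node processes its share of rows at the same rate, so the query finishes only when the most heavily loaded node finishes; the function name and row counts are hypothetical.

```python
def parallel_speedup(rows_per_node):
    """Speedup over a single node when the slowest node sets the elapsed time."""
    total_rows = sum(rows_per_node)
    slowest_node_rows = max(rows_per_node)
    return total_rows / slowest_node_rows


# Hypothetical row counts across four nodes.
even = [250, 250, 250, 250]    # balanced distribution
skewed = [700, 100, 100, 100]  # one node holds most of the work

print(parallel_speedup(even))
print(parallel_speedup(skewed))
```

With the balanced distribution the four nodes give a fourfold speedup; with the skewed one the speedup falls well below two, even though the hardware is the same.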