Data Storage and Basic File
Structure
Ms. Amrit Kaur
4/29/2021 1:05 PM
• Databases consist of large amounts of data that
are stored permanently on magnetic disk.
• Database applications need only a small
portion of the database at a time for processing.
– Data is copied from the disk to main memory for
processing and written back to the disk if it
has changed.
Data Files
• The data on the disk is physically stored as
files of records.
• A data file is a sequence of records
Records and Record Types
• A record is a collection of related data values
or items, where each value corresponds to a particular field.
– A record describes a particular entity, its
attributes, and its relationships.
• Types of Records
– Fixed length records
– Variable length records
Records and Record Types
• Fixed length record
– All records in the file have exactly the same size in
bytes
– Every record has the same fields, and field lengths are
fixed.
– Example:
• CREATE TABLE student
(rno char(3),
name char(15),
city char(15));
1 char occupies 1 byte (assuming a single-byte character set)
Total Record Size = 3 + 15 + 15 = 33 bytes
rno (3 bytes)  name (15 bytes)  city (15 bytes)  size (bytes)
1..            Amrit..........  Delhi..........  33
2..            Dj.............  Chennai........  33
12.            Jaspreet.......  Goa............  33
123            Jasmeet........  Delhi..........  33
(each "." marks one byte of padding)
Records and Record Types
• Variable length record
– Different records in the file have different
sizes
– Example:
• CREATE TABLE student
(rno varchar(3),
name varchar(15),
city varchar(15));
rno  name      city     size (bytes)
1    Amrit     Delhi    11
2    Dj        Chennai  10
12   Jaspreet  Goa      13
123  Jasmeet   Delhi    15
(sizes count the data characters only)
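The size arithmetic in the two examples above can be checked with a short Python sketch. It is only an illustration: the names FIXED_LAYOUT, pack_fixed, unpack_fixed, record_offset and variable_payload_size are hypothetical, zero bytes stand in for the padding, and the variable-length size counts data bytes only (a real storage engine would also store lengths or delimiters).

```python
import struct

# Layout matching char(3), char(15), char(15): 3 + 15 + 15 = 33 bytes.
# "<" turns off alignment padding, so the size is exactly the sum of the fields.
FIXED_LAYOUT = struct.Struct("<3s15s15s")

def pack_fixed(rno: str, name: str, city: str) -> bytes:
    """Encode one fixed-length record; short values are padded with zero bytes."""
    return FIXED_LAYOUT.pack(rno.encode("ascii"), name.encode("ascii"), city.encode("ascii"))

def unpack_fixed(raw: bytes) -> tuple:
    """Decode one fixed-length record and strip the padding bytes."""
    return tuple(f.rstrip(b"\x00").decode("ascii") for f in FIXED_LAYOUT.unpack(raw))

def record_offset(record_number: int) -> int:
    """With fixed-length records, record i starts at byte i * record size."""
    return record_number * FIXED_LAYOUT.size

def variable_payload_size(rno: str, name: str, city: str) -> int:
    """Data bytes of a varchar-style record (length prefixes are not counted)."""
    return len(rno) + len(name) + len(city)

students = [
    ("1", "Amrit", "Delhi"),
    ("2", "Dj", "Chennai"),
    ("12", "Jaspreet", "Goa"),
    ("123", "Jasmeet", "Delhi"),
]
fixed_sizes = [len(pack_fixed(*s)) for s in students]
variable_sizes = [variable_payload_size(*s) for s in students]
```

Because every fixed-length record has the same size, the position of any record can be computed directly, which is not possible with variable-length records.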
Records and Record Types
• Reasons for having variable length records
– Record types in which
• one or more fields have variable length
• one or more fields are optional
– A file containing records of different record types
– One or more fields have multiple values for
individual records
FILE ORGANIZATION
What is File Organization?
• File organization simply means the organization
of records in files.
• A file organization is defined as a technique that
determines
– how the file records are physically arranged on the
disk and
– how the records can be accessed
Need for File Organization
• Fast data retrieval
• Efficient use of storage space
• Protection from failure or data loss
• Minimizing the need for reorganization
• Security from unauthorized users
Types of File Organization
• Heap File Organization
• Sequential File Organization
• Indexed File Organization
• Hashed File Organization
Heap File Organization
• Records (data) are stored in the file in the order in
which they are inserted
Student (RollNumber, Name, City)
217 Sita Delhi
101 Ramesh Chennai
215 Gita Chennai
102 Mina Mumbai
201 Suresh Delhi
218 Mina Chennai
222 Ram Chennai
305 Robin Mumbai
220 Amrit Delhi
Heap File Organization
• Also called a pile file or non-sequential
organization.
• Operations
– Insertion is at the end of the file, so it is very efficient
– Retrieval in order of the values of a field requires external sorting
– Searching involves a linear search through the file, so searching is
slow
– Deletion leaves unused space and requires periodic
reorganization, which is time consuming and inefficient
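The heap file operations above can be sketched as a small in-memory Python class. This is a toy model, not how a DBMS lays out disk blocks: the class name HeapFile and its methods are hypothetical, and a deleted record is simply marked with None to stand for unused space.

```python
class HeapFile:
    """Toy in-memory heap (pile) file: records are kept in insertion order."""

    def __init__(self):
        self.slots = []  # one slot per record; None marks a deleted record

    def insert(self, record):
        """Append at the end of the file; no ordering is maintained."""
        self.slots.append(record)
        return len(self.slots) - 1  # the slot number acts as a record pointer

    def search(self, roll_number):
        """Linear search: examine the slots one by one until the key matches."""
        for position, record in enumerate(self.slots):
            if record is not None and record[0] == roll_number:
                return position
        return None

    def delete(self, roll_number):
        """Mark the slot as unused; the space is not reclaimed yet."""
        position = self.search(roll_number)
        if position is None:
            return False
        self.slots[position] = None
        return True

    def reorganize(self):
        """Periodic reorganization: rewrite the file without the unused slots."""
        self.slots = [r for r in self.slots if r is not None]

    def in_key_order(self):
        """Ordered retrieval needs a sort, since the file itself is unordered."""
        return sorted(r for r in self.slots if r is not None)

heap = HeapFile()
for student in [
    (217, "Sita", "Delhi"), (101, "Ramesh", "Chennai"), (215, "Gita", "Chennai"),
    (102, "Mina", "Mumbai"), (201, "Suresh", "Delhi"), (218, "Mina", "Chennai"),
    (222, "Ram", "Chennai"), (305, "Robin", "Mumbai"), (220, "Amrit", "Delhi"),
]:
    heap.insert(student)
```

Here heap.search(102) has to pass over three other records before it finds the match, while each insert is a single append.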
Sequential Data File
• Records (data) in the file are stored in sequence
according to the value of the search key and/or primary
key of each record.
Student (RollNumber, Name, City)
101 Ramesh Chennai
201 Suresh Delhi
210 Joy Mumbai
215 Gita Chennai
217 Sita Delhi
218 Mina Chennai
222 Ram Chennai
305 Robin Mumbai
Sequential File Organization
• Operations
– Retrieval in key order is efficient because no sorting is required
– Searching involves a binary search through the file, so
it is moderately fast
– Insertion and deletion are expensive and time
consuming because they require reordering and
rewriting the file
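A minimal Python sketch of these operations, assuming the whole file fits in a list kept sorted on RollNumber. The function names binary_search and insert_in_order are hypothetical; bisect.insort is the standard-library call that inserts into a sorted list.

```python
import bisect

# The sequential file from the slide, sorted on RollNumber.
sequential_file = [
    (101, "Ramesh", "Chennai"), (201, "Suresh", "Delhi"), (210, "Joy", "Mumbai"),
    (215, "Gita", "Chennai"), (217, "Sita", "Delhi"), (218, "Mina", "Chennai"),
    (222, "Ram", "Chennai"), (305, "Robin", "Mumbai"),
]

def binary_search(sorted_records, roll_number):
    """Halve the search range at each step; return the position or None."""
    low, high = 0, len(sorted_records) - 1
    while low <= high:
        mid = (low + high) // 2
        key = sorted_records[mid][0]
        if key == roll_number:
            return mid
        if key < roll_number:
            low = mid + 1
        else:
            high = mid - 1
    return None

def insert_in_order(sorted_records, record):
    """Insert while keeping the order; every later record has to shift."""
    bisect.insort(sorted_records, record)
    return sorted_records.index(record)
```

Inserting (220, "Amrit", "Delhi") places it between 218 and 222 and moves the two records after it, which is the reordering cost the slide refers to.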
Indexed File Organization
• Two files
– Data file: table data (e.g. .MYD in MySQL MyISAM)
– Index file: index of the data (e.g. .MYI in MySQL MyISAM)
What is it?
• In the data file, records are stored either
sequentially or non-sequentially, and
• an index file is created that allows applications to
locate individual records.
What is an Index?
• An index is a table used to determine the location of
records in a file.
• Indexes speed up the retrieval of records that match search
conditions.
• Any field (column) of the file can be used to create an
index; it is then known as the index field.
• Multiple indexes on different fields can be constructed.
(contd.)
• Types of Index
– Ordered indices
• The index file is sorted in order of the index field
– Hash indices
• Based on a uniform distribution of values across buckets, determined by
a function called a hash function.
Indexing Methods Based on Ordering
• Primary Index
• Clustering Index
• Secondary Index
• Dense Index
• Sparse Index
How are Indexes Stored?
• Ordered file with two fields (Key, Pointer)
– First field (Key): value of the field used for indexing
– Second field (Pointer): a block or record pointer
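As a rough illustration of the (Key, Pointer) layout, the following Python sketch builds an index over an unordered data file, using list positions as record pointers. The names data_file, build_index and lookup are hypothetical. Two indexes are built on different fields of the same file.

```python
import bisect

# Unordered data file; the list position plays the role of a record pointer.
data_file = [
    (217, "Sita", "Delhi"), (101, "Ramesh", "Chennai"), (215, "Gita", "Chennai"),
    (102, "Mina", "Mumbai"), (201, "Suresh", "Delhi"), (218, "Mina", "Chennai"),
    (222, "Ram", "Chennai"), (305, "Robin", "Mumbai"), (220, "Amrit", "Delhi"),
]

def build_index(records, field):
    """Return the index as a list of (key, pointer) pairs sorted by key."""
    return sorted((record[field], pointer) for pointer, record in enumerate(records))

def lookup(index, records, key):
    """Binary-search the index, then follow each matching pointer."""
    i = bisect.bisect_left(index, (key,))  # first entry whose key is >= key
    matches = []
    while i < len(index) and index[i][0] == key:
        matches.append(records[index[i][1]])
        i += 1
    return matches

roll_index = build_index(data_file, 0)  # index on RollNumber
name_index = build_index(data_file, 1)  # a second index on Name
```

The data file itself stays in insertion order; only the small index files are sorted.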
Primary Index
• When the file is ordered on a field that
has a unique value for each record, the index
on that field is known as a primary index.
• A primary index can be characterized as
– Dense
– Sparse
Clustering Index
• When the file is ordered on a field that does
not have a distinct value for each record, the index
on that field is known as a clustering index.
• It is a nondense (sparse) index.
• In MySQL (InnoDB), creating a table with a primary key
automatically creates a special index named PRIMARY,
which is used as the table's clustered index. If there is no
primary key, the first unique index on NOT NULL columns is used instead.
Secondary Index
• May be defined on a field that is a candidate key,
or on a non-key field with duplicate values
• There can be many secondary indexes for the
same file.
• It is a dense index.
Primary Index (contd.)
• A DENSE INDEX has an index entry for every
search key value (and hence every record)
Primary Index (contd.)
• A SPARSE INDEX (nondense) has entries for
only some of the search key values.
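The difference between the two can be sketched in Python over a sorted file split into blocks. The block size of 3 records and all names (BLOCK_SIZE, blocks, dense_index, sparse_index, dense_lookup, sparse_lookup) are assumptions made for the example, and the sparse index here keeps one entry per block (the first key in the block), which is one common choice.

```python
import bisect

BLOCK_SIZE = 3  # records per block (hypothetical; real blocks hold many more)

sorted_file = [
    (101, "Ramesh", "Chennai"), (201, "Suresh", "Delhi"), (210, "Joy", "Mumbai"),
    (215, "Gita", "Chennai"), (217, "Sita", "Delhi"), (218, "Mina", "Chennai"),
    (222, "Ram", "Chennai"), (305, "Robin", "Mumbai"),
]
blocks = [sorted_file[i:i + BLOCK_SIZE] for i in range(0, len(sorted_file), BLOCK_SIZE)]

# Dense index: one entry per record, pointing at (block number, offset in block).
dense_index = [(rec[0], (b, o)) for b, block in enumerate(blocks) for o, rec in enumerate(block)]

# Sparse index: one entry per block, holding the first key stored in that block.
sparse_index = [(block[0][0], b) for b, block in enumerate(blocks)]

def dense_lookup(key):
    """Find the exact index entry, then read the record it points to."""
    i = bisect.bisect_left(dense_index, (key,))
    if i < len(dense_index) and dense_index[i][0] == key:
        b, o = dense_index[i][1]
        return blocks[b][o]
    return None

def sparse_lookup(key):
    """Find the last block whose first key is <= key, then scan that block."""
    first_keys = [k for k, _ in sparse_index]
    i = bisect.bisect_right(first_keys, key) - 1
    if i < 0:
        return None
    for rec in blocks[sparse_index[i][1]]:
        if rec[0] == key:
            return rec
    return None
```

The sparse index has 3 entries against 8 for the dense index, at the cost of scanning one block after the index search.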
Problems with simple ordered indexes
that are kept on disk
• Searching the index is still not fast enough (binary
search):
– We do not want more than 3 to 4 comparisons
for a search
• Insertions into and deletions from the index are expensive
– The index file must be kept sorted
SOLUTION
• Multilevel Indexing
Multilevel Indexing
• Creating an index on an index file is called
multilevel indexing.
• How?
– Build a simple index for the file, as a sorted file with a
distinct value for each key (first or base level)
– Build a primary index for this index
– Build another index for the previous index
– Continue the index-building process until the index fits in a
single block, called the top index level
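The index-building process can be sketched in Python as follows. The fan-out of 3 entries per index block and the names FAN_OUT, build_levels and search are assumptions chosen to keep the example small. Each higher level keeps the first key of every block of the level below it.

```python
import bisect

FAN_OUT = 3  # index entries per block (hypothetical; real blocks hold many more)

def build_levels(sorted_keys, fan_out=FAN_OUT):
    """Return the index levels, base level first, single-block top level last."""
    levels = [list(sorted_keys)]
    while len(levels[-1]) > fan_out:
        lower = levels[-1]
        # One entry per block of the level below: the first key of that block.
        levels.append(lower[::fan_out])
    return levels

def search(levels, key, fan_out=FAN_OUT):
    """Descend from the top level; return the key's base-level position or None."""
    block_start = 0  # the top level is a single block starting at entry 0
    for depth in range(len(levels) - 1, -1, -1):
        level = levels[depth]
        block = level[block_start:block_start + fan_out]
        i = bisect.bisect_right(block, key) - 1  # last entry in the block <= key
        if i < 0:
            return None  # key is smaller than every key in the file
        position = block_start + i
        if depth == 0:
            return position if level[position] == key else None
        block_start = position * fan_out  # entry `position` points at this block
    return None

roll_numbers = [101, 201, 210, 215, 217, 218, 222, 305]
levels = build_levels(roll_numbers)
```

With 8 keys and a fan-out of 3, two levels are enough, and a search reads one block per level instead of binary-searching the whole base index.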
(contd.)
• Multilevel indexing is commonly implemented using a
variation of the B-tree data structure called the
B+ tree
[Figure: example B+ tree]
Hashed File Organization
What is it?
• In a hashed file organization, the address of each
record is determined using a hashing algorithm.
• A function h, called a hash function,
is applied to the hash field value (key)
of a record and computes the address of the
disk block (bucket) in which the record is
stored.
Types of Hashing
• Static Hashing
• Dynamic Hashing
Static Hashing
• Uses a hash function for which the set of bucket
addresses is fixed.
• Common hash functions
– Division method
– Mid-square method
– Folding method, etc.
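The three methods named above can be sketched in Python for non-negative integer keys. Textbooks differ on the details, so the choices here are assumptions: the mid-square variant takes the middle two decimal digits of the squared key, the folding variant sums two-digit chunks, and both reduce the result modulo the number of buckets. The function names are hypothetical.

```python
def division_hash(key: int, num_buckets: int) -> int:
    """Division method: the remainder of the key divided by the bucket count."""
    return key % num_buckets

def mid_square_hash(key: int, num_buckets: int, digits: int = 2) -> int:
    """Mid-square method: square the key and take its middle decimal digits."""
    squared = str(key * key)
    start = max(0, (len(squared) - digits) // 2)
    middle = squared[start:start + digits]
    return int(middle) % num_buckets

def folding_hash(key: int, num_buckets: int, chunk: int = 2) -> int:
    """Folding method: split the key into digit chunks and add them up."""
    text = str(key)
    parts = [int(text[i:i + chunk]) for i in range(0, len(text), chunk)]
    return sum(parts) % num_buckets
```

For key 217 and 10 buckets: the division method gives 217 mod 10 = 7; the mid-square method squares to 47089, takes the middle digits 70, and gives 70 mod 10 = 0; folding gives 21 + 7 = 28, and 28 mod 10 = 8.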
Collision Resolution
• A collision occurs when the hash field value of
a new record that is being inserted hashes to
an address that already contains a different
record.
• The process of finding another position is
called collision resolution.
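The slides do not name a particular resolution technique. As one possible illustration, the Python sketch below uses linear probing (open addressing): when the home address is occupied, the following addresses are tried in turn. The class name LinearProbingTable, the table size of 7 and the use of the division method are assumptions, and deletion is not handled.

```python
class LinearProbingTable:
    """Toy hash file with one record per address and linear probing."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots

    def insert(self, record):
        """Store the record at its home address or at the next free one."""
        home = record[0] % len(self.slots)  # division method on the key
        for step in range(len(self.slots)):
            position = (home + step) % len(self.slots)
            if self.slots[position] is None:
                self.slots[position] = record
                return position
        raise RuntimeError("hash file is full")

    def search(self, key):
        """Probe from the home address until the key or an empty slot is found."""
        home = key % len(self.slots)
        for step in range(len(self.slots)):
            record = self.slots[(home + step) % len(self.slots)]
            if record is None:
                return None
            if record[0] == key:
                return record
        return None

table = LinearProbingTable(7)
positions = [
    table.insert(student)
    for student in [
        (217, "Sita", "Delhi"), (101, "Ramesh", "Chennai"), (215, "Gita", "Chennai"),
        (102, "Mina", "Mumbai"), (201, "Suresh", "Delhi"),
    ]
]
```

Keys 215 and 201 both hash to address 5 (215 mod 7 = 201 mod 7 = 5), so the second one collides and is placed at address 6.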
How is Hashing Done?
[Figure not reproduced]
Dynamic Hashing
• Some hashing techniques allow the hash
function to be modified dynamically to
accommodate the growth or shrinkage of the
database.
Extendable Hashing
• We choose a hash function that is uniform and
random. It generates values over a relatively
large range.
• The hash addresses in the address space (i.e.
the range) are represented by d-bit binary
integers (typically d = 32). As a result, we can
have a maximum of 2^32 (over 4 billion)
buckets.
• Buckets are not all created at once.
• They are created on demand, depending on the size
of the file.
• The number of bits used to represent bucket addresses
corresponds to the number of buckets actually
created.
• For example, if there are four buckets at the
moment, we need just 2 bits for the addresses (i.e. 00, 01, 10 and
11).
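The addressing idea can be sketched in Python. This is not a full extendable hashing implementation (there is no bucket splitting or directory doubling); it only shows how many address bits are needed and how an address is taken from a 32-bit hash value. Using CRC-32 from zlib as the hash function and taking the high-order bits are assumptions made for the example, and all function names are hypothetical.

```python
import zlib

HASH_BITS = 32  # d in the slides

def hash32(key: str) -> int:
    """Stand-in 32-bit hash function (CRC-32); returns a value in [0, 2**32)."""
    return zlib.crc32(key.encode("utf-8"))

def bits_needed(num_buckets: int) -> int:
    """Smallest number of bits i such that 2**i >= num_buckets."""
    return (num_buckets - 1).bit_length()

def directory_index(key: str, bits: int) -> int:
    """Use only the `bits` high-order bits of the 32-bit hash as the address."""
    return hash32(key) >> (HASH_BITS - bits)

def addresses(bits: int) -> list:
    """All addresses that can be written with the given number of bits."""
    return [format(n, f"0{bits}b") for n in range(2 ** bits)]
```

With four buckets, bits_needed(4) is 2 and addresses(2) lists 00, 01, 10 and 11; as more buckets are created, more bits of the same hash value are used.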
