Oxford University Press © 2012 Data Structures Using C++ by Dr Varsha Patil
• Use of hashing techniques that support very fast retrieval via a key
• Factors that affect the performance of hashing
• Collision resolution strategies
• Hashing finds the address at which data is stored, and later located, by applying an algorithmic function to a key
• Hashing is a method of directly computing the address of a record from its key by using a suitable mathematical function called the hash function
• A hash table is an array-based structure used to store <key, information> pairs
key → Hash(key) → Address
Fig 11.1: Hashing Concept
• The resulting address is used as the basis for storing and retrieving records; it is called the home address of the record
• To store a record in a hash table, the hash function is applied to the key of the record being stored, returning an index within the range of the hash table
• The item is then stored in the table at that index position
• 1. With hashing, the address generated appears to be random: there is no immediately obvious connection between the key and the location of the corresponding record, even though the key is used to determine that location. For this reason, hashing is sometimes referred to as randomizing
• 2. With hashing, two different keys may be transformed to the same address, so two records may be sent to the same place in the file. When this occurs, it is called a collision, and some means must be found to deal with it. Two or more records that result in the same home address are called synonyms
• A problem arises, however, when the hash function returns the same value for two different keys
• To handle the situation where two records hash to the same address, we can implement a table structure that has room for two or more members at the same index position
• A hash function maps a key into the range [0, Max − 1]; the result is used as an index (or address) into the hash table for storing and retrieving records
• The address generated by the hash function is called the home address
• All home addresses refer to a particular area of memory, and that area is called the prime area
• A bucket is an index position in the hash table that can store more than one record
• When two keys map to the same index, both records are stored in the same bucket
• The result of two keys hashing to the same address is called a collision
• Keys that hash to the same address are called synonyms
• When more keys hash to the same address and there is no room left in the bucket, overflow is said to have occurred
• Collision and overflow are synonymous when the bucket is of size 1
• When records may be stored in potentially unlimited space, it is called open or external hashing
• When a fixed space is used for storage, which eventually limits the number of records that can be stored, it is called closed or internal hashing
• A hash function is an arithmetic function that transforms a key into an address; the address is used for storing and retrieving a record
• A hash function that transforms different keys into different addresses is called a perfect hash function
• The worth of a hash function depends on how well it avoids collisions
• The maximum storage capacity, that is, the maximum number of records that can be accommodated, is called the loading density
• A full table is one in which all locations are occupied
• Owing to the characteristics of hash functions, there are always empty locations
• The load factor is the number of records stored in the table divided by the maximum capacity of the table, expressed as a percentage
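The definition above can be sketched as a one-line computation. The function name and the convention of returning 0 for a zero-capacity table are assumptions made for illustration only.

```cpp
#include <cstddef>

// Load factor as a percentage: records stored / table capacity * 100.
// Returning 0 for a zero-capacity table is an assumed convention that
// simply avoids dividing by zero.
double loadFactorPercent(std::size_t recordsStored, std::size_t capacity) {
    if (capacity == 0) return 0.0;
    return 100.0 * static_cast<double>(recordsStored) / static_cast<double>(capacity);
}
```

For example, a table of capacity 10 that holds 7 records has a load factor of 70 percent.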
• Rehashing applies to closed hashing. When we try to store the record with Key1 at bucket position Hash(Key1) and find that it already holds a record, we have a collision
• To handle the collision, we use a strategy that chooses a sequence of alternative locations Hash1(Key1), Hash2(Key1), … within the bucket table in which to place the record with Key1
• This is called rehashing
• Features of a Good Hash Function
• Division Method
• Multiplication Method
• Extraction Method
• Mid-Square Hashing
• Folding Technique
• Rotation
• Universal Hashing
• The average performance of hashing depends on how the hash function distributes the set of keys among the slots
• The assumption is that any given record is equally likely to hash into any of the slots, independently of whether any other record has already hashed to it
• This assumption is called simple uniform hashing
• A good hash function is one that satisfies the assumption of simple uniform hashing
• Addresses generated from the keys are uniformly and randomly distributed
• Small variations in the value of the key cause large variations in the record addresses, so that records with similar keys are distributed evenly
• The hash function must minimize collisions
• One required feature of a hash function is that the resulting index must lie within the table index range
• One simple choice for a hash function is modulus division, indicated as MOD (the operator % in C/C++)
• The function returns an integer
• If any parameter is NULL, the result is NULL
• Hash(Key) = Key % M, where M is the table size
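A minimal sketch of the division method follows. The function name and the choice of an unsigned key type are assumptions; an unsigned key keeps the result of % from ever being negative, so the index always lies in the table range.

```cpp
#include <cstddef>
#include <cstdint>

// Division-method hash: Hash(Key) = Key % M.
// The remainder is always in [0, tableSize - 1], so it is a valid table index.
// tableSize must be non-zero; the key type is unsigned by assumption.
std::size_t divisionHash(std::uint64_t key, std::size_t tableSize) {
    return static_cast<std::size_t>(key % tableSize);
}
```

With a table of size 10, the keys 4371 and 1323 hash to 1 and 3 respectively.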
• The multiplication method works as follows:
• 1. Multiply the key ‘Key’ by a constant A in the range 0 < A < 1 and extract the fractional part of Key × A
• 2. Then multiply this value by M and take the floor of the result
• $\mathrm{Hash}(Key) = \lfloor M \times ((Key \times A) \bmod 1) \rfloor$,
• where $(Key \times A) \bmod 1$ means the fractional part of Key × A, that is, $Key \times A - \lfloor Key \times A \rfloor$, and $A = (\sqrt{5} - 1)/2 \approx 0.6180339887$
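The two steps can be sketched as follows. This is an illustrative version that uses double-precision arithmetic, so very large keys would lose precision; the function name is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Multiplication-method hash with A = (sqrt(5) - 1) / 2.
// Step 1: take the fractional part of key * A.
// Step 2: scale it by the table size M and take the floor.
std::size_t multiplicationHash(std::uint64_t key, std::size_t tableSize) {
    const double A = (std::sqrt(5.0) - 1.0) / 2.0;            // about 0.6180339887
    const double product = static_cast<double>(key) * A;
    const double fraction = product - std::floor(product);    // (key * A) MOD 1
    return static_cast<std::size_t>(std::floor(static_cast<double>(tableSize) * fraction));
}
```

Because the fractional part is always below 1, the result stays below the table size whatever value M takes, which is why this method does not depend on M being prime.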
• When only a portion of the key is used for the address calculation, the technique is called the extraction method
• In digit extraction, a few digits are selected and extracted from the key, and these are used as the address
Key      Hashed Address
345678   357
234137   243
952671   927
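The addresses in this table are consistent with keeping the 1st, 3rd and 5th digits of each six-digit key. That reading is an inference from the three rows, not something the slides state, and the sketch below is built on it; the function name is hypothetical.

```cpp
#include <string>

// Digit extraction: keep the 1st, 3rd and 5th decimal digits of a six-digit key.
// The digit positions are inferred from the table and are an assumption.
// Returns -1 for keys that do not have exactly six digits.
int extractDigitsHash(int key) {
    const std::string digits = std::to_string(key);
    if (digits.size() != 6) return -1;
    const std::string picked{digits[0], digits[2], digits[4]};
    return std::stoi(picked);
}
```

Under that assumption, 345678 maps to 357, matching the first row of the table.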
• Mid-square hashing takes the square of the key and extracts the middle digits of the squared key as the address
• The difficulty arises when the key is large. Because the entire key participates in the address calculation, the square of a large key may exceed the storage limit
• So mid-square is used when the key size is less than or equal to 4 digits
Key    Square    Hashed Address
2341   5480281   802
1671   2792241   922
The difficulty of storing the square of a large number can be overcome by squaring only a few digits of the key instead of the whole key
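A sketch of mid-square hashing that reproduces the two rows of the table follows. The function name, the default address width of 3 digits, and the rule for choosing the middle when the digits cannot be centred exactly are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Mid-square hash: square the key and keep the middle `width` digits of the square.
// The square is computed in 64 bits so that a 32-bit key cannot overflow it.
// If the middle cannot be centred exactly, the extra digit is left on the right
// (an assumed convention; the table only shows 7-digit squares).
int midSquareHash(std::uint32_t key, std::size_t width = 3) {
    const std::uint64_t square = static_cast<std::uint64_t>(key) * key;
    const std::string digits = std::to_string(square);
    if (digits.size() < width) return static_cast<int>(square);
    const std::size_t start = (digits.size() - width) / 2;
    return std::stoi(digits.substr(start, width));
}
```

For the key 2341 the square is 5480281 and the middle three digits give the address 802.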
We can select a portion of the key if the key is large and then square that portion
Keys and addresses obtained by extracting a few digits, squaring them, and again extracting the middle digits
Key      Square                 Hashed Address
234137   234 x 234 = 054756     475
567187   567 x 567 = 321489     148
• In the folding technique, the key is subdivided into subparts that are folded and then combined to form the address
• For a key made of digits, we can subdivide the digits into three parts, add them up, and use the result as the address
• Here each subpart of the key can be the same size as the address
• There are two types of folding methods:
• Fold shift: the key value is divided into several parts, each the size of the address. The left, right, and middle parts are added
• Fold boundary: the key value is divided into parts, each the size of the address. The left and right parts are folded on the fixed boundary between them and the centre part
• For example, if the key is 987654321, it is understood as Left 987, Centre 654, Right 321
• For fold shift, the addition is
• 987 + 654 + 321 = 1962
• Now discard the digit 1 and the address is 962
• For fold boundary, the left and right parts are reversed and the centre part is kept, so the addition is
• 789 + 654 + 123 = 1566
• Discard the digit 1 and the address is 566
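The fold shift calculation can be sketched as below. The function name is hypothetical, the part size is fixed at 3 digits to match the example, and the key is split starting from the right, which yields the same three parts for a 9-digit key.

```cpp
#include <cstdint>

// Fold shift: split the key into 3-digit parts, add the parts,
// and keep only the low 3 digits of the sum (the carry is discarded).
int foldShiftHash(std::uint64_t key) {
    std::uint64_t sum = 0;
    while (key > 0) {
        sum += key % 1000;   // take the rightmost 3-digit part
        key /= 1000;         // move on to the next part
    }
    return static_cast<int>(sum % 1000);
}
```

For 987654321 the parts are 321, 654 and 987, the sum is 1962, and the address is 962.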
• When keys are serial, they vary only in the last digit, and this leads to the creation of synonyms
• Rotating the key minimizes this problem. This method is used along with other methods
• Here, the key is rotated right by one digit, and then folding is applied, which avoids synonyms
• For example, let the key be 120605; when it is rotated we get 512060
• The address is then calculated using any other hash function
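A small sketch of the rotation step follows. The key is handled as a string of decimal digits, an assumed representation chosen so that a leading zero produced by the rotation is not lost; the function name is hypothetical.

```cpp
#include <string>

// Rotate the decimal digits of a key right by one position:
// the last digit moves to the front and the others shift right.
std::string rotateRight(const std::string& keyDigits) {
    if (keyDigits.size() < 2) return keyDigits;
    return keyDigits.back() + keyDigits.substr(0, keyDigits.size() - 1);
}
```

Rotating "120605" gives "512060", and the rotated key would then be passed to another hash function such as folding.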
• The main idea behind universal hashing is to select the hash function at random, at run time, from a carefully designed set of functions
• Because of randomization, the algorithm can behave differently on each execution, even for the same input
• This approach guarantees good average-case performance, no matter what keys are provided as input
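The slides do not name a particular set of functions. As an illustration only, the sketch below assumes the widely used family h(k) = ((a·k + b) mod p) mod m, where p is a prime larger than any key and a, b are chosen at random when the table is created. The class name, the prime 2^31 − 1, and the optional seed parameter are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Assumed family: h(k) = ((a * k + b) mod p) mod m, with keys below p.
constexpr std::uint64_t kPrime = 2147483647ULL;  // the prime 2^31 - 1

class UniversalHash {
public:
    // a and b are drawn once, at run time. A fixed seed may be passed
    // to make a run repeatable, for example in tests.
    explicit UniversalHash(std::size_t tableSize,
                           std::uint32_t seed = std::random_device{}())
        : m_(tableSize) {
        std::mt19937 gen(seed);
        a_ = std::uniform_int_distribution<std::uint64_t>(1, kPrime - 1)(gen);
        b_ = std::uniform_int_distribution<std::uint64_t>(0, kPrime - 1)(gen);
    }

    // Keys are assumed to be smaller than kPrime, so a_ * key fits in 64 bits.
    std::size_t operator()(std::uint64_t key) const {
        return static_cast<std::size_t>(((a_ * key + b_) % kPrime) % m_);
    }

private:
    std::uint64_t a_ = 1;
    std::uint64_t b_ = 0;
    std::size_t m_;
};
```

Two tables built without a seed will usually pick different functions, which is what prevents a fixed set of keys from always colliding.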
• No hash function is perfect
• If Hash(Key1) = Hash(Key2), then Key1 and Key2 are synonyms, and if the bucket size is 1, we say that a collision has occurred
• As a consequence, we have to store the record with Key2 at some other location
• A search is made for a bucket in which the record containing Key2 can be stored, using one of the several collision resolution strategies
• Open addressing
  • Linear probing
  • Quadratic probing
  • Double hashing, and
  • Key offset
• Separate chaining (or linked list)
• Bucket hashing (defers collision but does not prevent it)
• In open addressing, when a collision occurs, it is resolved by finding an available empty location other than the home address
• If Hash(Key) is not empty, the positions are probed in the following sequence until an empty location is found
• N(Hash(Key) + C(1)), N(Hash(Key) + C(2)), …, N(Hash(Key) + C(i)), …
• When we reach the end of the table, the search wraps around to the start and continues until it returns to the location where the collision occurred
• The most important factors in avoiding collisions are the table size and the choice of hash function
• Linear probing resolves a collision by putting the item in the next empty place following the occupied place
• This strategy looks for the next free location until one is found
• The function that we can use for probing linearly from the next location is as follows:
• (Hash(x) + p(i)) MOD Max
• As p(i) = i for linear probing, the function becomes
• (Hash(x) + i) MOD Max
• Initially i = 1; if the location is not empty, then i becomes 2, 3, 4, …, and so on until an empty location is found
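The probing function can be sketched as an insertion routine. The names are hypothetical, Hash(x) is assumed to be x % Max, keys are assumed non-negative, and −1 is used as an assumed marker for an empty slot.

```cpp
#include <vector>

constexpr int kEmpty = -1;  // assumed marker for an unused slot

// Insert `key` using (Hash(x) + i) MOD Max with Hash(x) = x % Max.
// Returns the slot used, or -1 if no free slot exists.
int linearProbeInsert(std::vector<int>& table, int key) {
    const int max = static_cast<int>(table.size());
    if (max == 0) return -1;
    const int home = key % max;
    for (int i = 0; i < max; ++i) {          // i = 0 tries the home address itself
        const int slot = (home + i) % max;   // wraps around at the end of the table
        if (table[slot] == kEmpty) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;  // every slot was probed and none was free
}
```

With a table of size 10, inserting 1323 and then 6173 places them in slots 3 and 4, because 6173 finds its home address 3 occupied.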
• With replacement
• Without replacement
• With replacement:
• If the slot is already occupied by a key, there are two possibilities: either the slot is that key's home address (a collision) or it is not
• If the occupying key's home address is different, then the new key, whose home address is that slot, is placed at that position, and the displaced key is placed in the next empty position
• Without replacement:
• When some data is to be stored in the hash table and the slot is already occupied by a key, another empty location is searched for the new record
• There are two possibilities when the location is occupied: it is the occupying key's home address, or it is not
• In both cases, the without-replacement strategy searches for an empty position for the key that is to be stored
• In quadratic probing, the offset added is the square of the collision probe number
• The empty location is searched for using the following formula
• (Hash(Key) + i²) MOD Max, where i lies between 1 and (Max − 1)/2
• Quadratic probing works much better than linear probing, but to make full use of the hash table, there are constraints on the values of i and Max so that the address lies within the table boundaries
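A sketch of insertion with quadratic probing follows. The names are hypothetical, Hash(Key) is assumed to be Key % Max with non-negative keys, and −1 is an assumed marker for an empty slot. The probe number is capped at (Max − 1)/2 as stated above.

```cpp
#include <vector>

constexpr int kEmpty = -1;  // assumed marker for an unused slot

// Insert `key` using (Hash(Key) + i*i) MOD Max for i = 0, 1, ..., (Max - 1) / 2.
// Returns the slot used, or -1 if no free slot is found within the probe limit.
int quadraticProbeInsert(std::vector<int>& table, int key) {
    const int max = static_cast<int>(table.size());
    if (max == 0) return -1;
    const int home = key % max;
    for (int i = 0; i <= (max - 1) / 2; ++i) {   // i = 0 is the home address
        const int slot = (home + i * i) % max;
        if (table[slot] == kEmpty) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Unlike linear probing, this routine can fail even when free slots remain, because the probe sequence visits only some of the table positions.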
• Double hashing uses two hash functions, one for accessing the home address of a key and the other for resolving the conflict. The probe sequence is generated as follows:
• Hash1(Key), (Hash1(Key) + i × Hash2(Key)), … for i = 1, 2, 3, 4, …
• Each resulting address is taken modulo Max
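The probe sequence can be sketched as an insertion routine. Hash1(Key) = Key % Max and Hash2(Key) = R − (Key % R) with R = 7 are taken from the worked example in these slides; the function name, the non-negative keys, and the −1 empty marker are assumptions.

```cpp
#include <vector>

constexpr int kEmpty = -1;  // assumed marker for an unused slot

// Insert `key` using (Hash1(Key) + i * Hash2(Key)) MOD Max, i = 0, 1, 2, ...
// Hash1(Key) = Key % Max and Hash2(Key) = r - (Key % r), which is never zero.
// Returns the slot used, or -1 if no free slot is reached.
int doubleHashInsert(std::vector<int>& table, int key, int r = 7) {
    const int max = static_cast<int>(table.size());
    if (max == 0) return -1;
    const int home = key % max;
    const int step = r - (key % r);          // always in the range [1, r]
    for (int i = 0; i < max; ++i) {          // i = 0 is the home address
        const int slot = (home + i * step) % max;
        if (table[slot] == kEmpty) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Because the step depends on the key, two keys with the same home address usually follow different probe sequences, which reduces clustering.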
Example:
Given the input {4371, 1323, 6173, 4199, 4344, 9699, 1889} and the hash function Key % 10, show the results for the following:
1. Open addressing using linear probing
2. Open addressing using quadratic probing
3. Open addressing using double hashing with h2(x) = 7 − (x MOD 7)
Index  Initially  Insert 4371  Insert 1323  Insert 6173  Insert 4199  Insert 4344  Insert 9699  Insert 1889
0      -          -            -            -            -            -            9699         9699
1      -          4371         4371         4371         4371         4371         4371         4371
2      -          -            -            -            -            -            -            1889
3      -          -            1323         1323         1323         1323         1323         1323
4      -          -            -            6173         6173         6173         6173         6173
5      -          -            -            -            -            4344         4344         4344
6      -          -            -            -            -            -            -            -
7      -          -            -            -            -            -            -            -
8      -          -            -            -            -            -            -            -
9      -          -            -            -            4199         4199         4199         4199
• Let us insert these keys using quadratic probing
• For 6173, the hashed address 6173 % 10 gives 3, which is not empty; hence, using quadratic probing, we get the address as follows:
• Hash(6173) = (6173 + 1²) % 10 = 4 and, as it is empty, the key 6173 is stored there
• Now while inserting 4344, location 4 is not empty, and hence quadratic probing generates the address (4344 + 1²) % 10 = 5 and, as it is empty, 4344 is stored there
• For key 9699, the address is (9699 + 1²) % 10 = 0, which is empty, so the key is stored there. While inserting 1889, the address (1889 + 1²) % 10 = 0 is not empty, so we probe again
• The address (1889 + 2²) % 10 = 3 is not empty, so we probe again
• The address (1889 + 3²) % 10 = 8 is empty, so 1889 is stored at location 8
• While inserting 6173, the address is Hash1(6173) = 6173 % 10 = 3, and location 3 is not empty
• Let us use double hashing. The address is then as follows:
• Hash(6173) = [Hash1(6173) + Hash2(6173)] % 10 = [3 + (R − 6173 % R)] % 10 (let R be 7) = [3 + (7 − 6)] % 10 = 4
• Since 4 is empty, we store 6173 at location 4
• Now let us store 4344. The address is 4344 % 10 = 4 and, as location 4 is not empty, we use double hashing and get Hash(4344) = 7
• For 9699, double hashing generates address 2 and, as it is empty, we store it there
• For key 1889, double hashing generates address 0 and, as it is empty, we store 1889 at location 0
• If the table gets full, insertion using open addressing with quadratic probing might fail or might take too much time
• The solution is to build another table that is about twice as big, scan down the entire original hash table, compute the new hash value for each record, and insert the records into the new table
• For example, if the table is of size 7 (Table 11.13) and the hash function is key % 7, then,
Insert 7, 15, 13, 74, 73
Index  Key
0      7
1      15
2      -
3      73
4      74
5      -
6      13
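The rebuilding step can be sketched as follows. This is an illustration under stated assumptions: the new table is exactly twice the old size, collisions are resolved by linear probing rather than quadratic probing, keys are non-negative, and −1 marks an empty slot. The function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

constexpr int kEmpty = -1;  // assumed marker for an unused slot

// Store `key` at key % size, stepping to the next slot on a collision
// (linear probing is an assumed strategy for this sketch).
bool placeKey(std::vector<int>& table, int key) {
    const std::size_t size = table.size();
    for (std::size_t i = 0; i < size; ++i) {
        const std::size_t slot = (static_cast<std::size_t>(key) + i) % size;
        if (table[slot] == kEmpty) {
            table[slot] = key;
            return true;
        }
    }
    return false;  // no free slot, or a zero-sized table
}

// Build a table twice as big, scan the old table from top to bottom,
// and re-insert every record using the hash value for the new size.
std::vector<int> rehash(const std::vector<int>& old) {
    std::vector<int> bigger(old.size() * 2, kEmpty);
    for (int key : old) {
        if (key != kEmpty) placeKey(bigger, key);
    }
    return bigger;
}
```

Every record has to be hashed again because the hash value key % size changes when the size changes; copying records to the same index in the new table would make them unreachable.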
• The technique used here to handle synonyms is chaining, which chains together all the records that hash to the same address. Instead of relocating synonyms, a linked list of synonyms is created whose head is the home address of the synonyms
• However, we need to handle pointers to form a chain of synonyms
• Extra memory is needed for storing the pointers
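A minimal sketch of chaining follows. The class name is hypothetical, Key % Max is assumed as the hash function with non-negative keys, and the standard singly linked list stands in for hand-written pointer handling.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// Each table slot is the head of a singly linked list holding all
// keys whose home address is that slot. The slot count must be non-zero.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t slots) : heads_(slots) {}

    // Synonyms are never relocated: the key joins the chain at its home address.
    void insert(int key) { heads_[home(key)].push_front(key); }

    // Only the chain at the key's home address has to be searched.
    bool contains(int key) const {
        for (int stored : heads_[home(key)]) {
            if (stored == key) return true;
        }
        return false;
    }

    std::size_t chainLength(std::size_t slot) const {
        return static_cast<std::size_t>(
            std::distance(heads_[slot].begin(), heads_[slot].end()));
    }

private:
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % heads_.size();
    }

    std::vector<std::forward_list<int>> heads_;
};
```

With 10 slots, the keys 322 and 262 are synonyms with home address 2, so both end up in the same chain.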
Fig 11.2: An Example of Chaining (a table with slots 0, 1, 2, …, max−1, in which one slot heads a chain holding the synonyms 322 and 262)
• Chaining
  • An unlimited number of synonyms can be handled in chaining
  • The additional cost to be paid is the overhead of multiple linked lists
  • Sequential search through a chain takes more time
• Rehashing
  • A limited but still good number of synonyms are taken care of
  • The table size is doubled, but no additional link field needs to be maintained
  • Searching is faster when compared to chaining
• An overflow is said to occur when a new identifier is mapped or hashed into a full bucket
• When the bucket size is one, collision and overflow occur simultaneously
• When a new identifier is hashed into a full bucket, we need to find another bucket for this identifier
• The simplest solution is to find the closest unfilled bucket
• This is called linear probing or linear open addressing
• Since the sizes of these lists are not known in advance, the best way to maintain them is as linked chains
• In each slot, additional space is required for a link
• Each chain has a head node
• The head node, however, is usually much smaller than the other nodes, since it has to retain only a link
• As the lists are accessed at random, the head nodes should be sequential
Fig 11.3: Chaining (a 26-slot table, 0 to 25, holding the identifiers A, A1, A2, A3, A4, D, E, G, GA, L, Z and ZA; in the chained version the synonyms A1 to A4, G and GA, and ZA and Z each share one chain, and unused head nodes hold a null link, 0)
• If linear probing or separate chaining is used for collision handling, then in case of collision several blocks must be examined to find a key, and when the table is full an expensive rehash is required
• For fast searching and fewer disk accesses, extendible hashing is used
• It is a type of hash system that treats a hash as a bit string and uses a trie for bucket lookup
• Many applications need a dynamic set that supports only the operations Insert, Member (Search), and Delete. A keyed table is an effective data structure for implementing them
• Hashing is an excellent technique for implementing keyed tables. A hash table is an array-based structure used to store <key, information> pairs
• Hash tables are used to implement insert and find in constant average time. To store an item in a hash table, a hash function is applied to the key of the item being stored, returning an index within the range of the hash table
• Hashing is a technique for storing and retrieving information associated with a key; it makes use of the individual characters or digits in the key itself
• A problem arises, however, when the hash function returns the same value for two different keys; this is called a collision. There are various collision resolution techniques to overcome this problem
Oxford University Press © 2012Data Structures Using C++ by Dr Varsha Patil
63

More Related Content

PPTX
Balanced Tree (AVL Tree & Red-Black Tree)
PDF
Searching and Sorting Techniques in Data Structure
PPTX
Threaded Binary Tree
PPTX
Merkle Trees and Fusion Trees
PPTX
Ppt on Linked list,stack,queue
PPTX
NON-LINEAR DATA STRUCTURE-TREES.pptx
PPT
DATA STRUCTURES
PPTX
Linked List
Balanced Tree (AVL Tree & Red-Black Tree)
Searching and Sorting Techniques in Data Structure
Threaded Binary Tree
Merkle Trees and Fusion Trees
Ppt on Linked list,stack,queue
NON-LINEAR DATA STRUCTURE-TREES.pptx
DATA STRUCTURES
Linked List

What's hot (20)

PPTX
Segmentation in Operating Systems.
PPTX
Hashing In Data Structure
PPT
Deadlock
PPTX
Doubly Linked List
PPTX
Introduction to OOP in Python
PPT
Data Structure and Algorithms Hashing
PPT
B trees in Data Structure
PPTX
Stack & Queue using Linked List in Data Structure
DOCX
Nonrecursive predictive parsing
PPTX
Dijkstra's Algorithm
PPT
Data Structure and Algorithms Heaps and Trees
PPTX
Linear Search Presentation
PPTX
Linear search-and-binary-search
PPTX
Hash table
PDF
Time and Space Complexity
PPT
Binary search tree in data structures
PPTX
Binary Heap Tree, Data Structure
PPTX
File systems versus a dbms
PPTX
Directory structure
Segmentation in Operating Systems.
Hashing In Data Structure
Deadlock
Doubly Linked List
Introduction to OOP in Python
Data Structure and Algorithms Hashing
B trees in Data Structure
Stack & Queue using Linked List in Data Structure
Nonrecursive predictive parsing
Dijkstra's Algorithm
Data Structure and Algorithms Heaps and Trees
Linear Search Presentation
Linear search-and-binary-search
Hash table
Time and Space Complexity
Binary search tree in data structures
Binary Heap Tree, Data Structure
File systems versus a dbms
Directory structure
Ad

Viewers also liked (20)

PPT
Hashing
PPTX
Hashing Technique In Data Structures
PPT
Concept of hashing
PPT
Hashing
PDF
Hashing and Hash Tables
PPTX
6. Linked list - Data Structures using C++ by Varsha Patil
ZIP
Hashing
PDF
PPT
Hashing PPT
PPTX
7. Tree - Data Structures using C++ by Varsha Patil
PPTX
Hashing Techniques in Data Structures Part2
PPTX
9. Searching & Sorting - Data Structures using C++ by Varsha Patil
PDF
Discrete Mathematics S. Lipschutz, M. Lipson And V. H. Patil
PPTX
10. Search Tree - Data Structures using C++ by Varsha Patil
PPTX
14. Files - Data Structures using C++ by Varsha Patil
PPT
Ch17 Hashing
PPT
Indexing and hashing
PPTX
8. Graph - Data Structures using C++ by Varsha Patil
PPTX
5. Queue - Data Structures using C++ by Varsha Patil
PPTX
Hash Function
Hashing
Hashing Technique In Data Structures
Concept of hashing
Hashing
Hashing and Hash Tables
6. Linked list - Data Structures using C++ by Varsha Patil
Hashing
Hashing PPT
7. Tree - Data Structures using C++ by Varsha Patil
Hashing Techniques in Data Structures Part2
9. Searching & Sorting - Data Structures using C++ by Varsha Patil
Discrete Mathematics S. Lipschutz, M. Lipson And V. H. Patil
10. Search Tree - Data Structures using C++ by Varsha Patil
14. Files - Data Structures using C++ by Varsha Patil
Ch17 Hashing
Indexing and hashing
8. Graph - Data Structures using C++ by Varsha Patil
5. Queue - Data Structures using C++ by Varsha Patil
Hash Function
Ad

Similar to 11. Hashing - Data Structures using C++ by Varsha Patil (20)

PPT
Hashing gt1
PPTX
13. Indexing MTrees - Data Structures using C++ by Varsha Patil
PPTX
DSA Presentation of Data Structures and Algorithms.pptx
PPTX
Hashing techniques discussion and examples
PPTX
Lec12-Hash-Tables-27122022-125641pm.pptx
PPTX
hashing in data structures and its applications
PDF
DBMS 9 | Extendible Hashing
PPTX
Hashing techniques, Hashing function,Collision detection techniques
PPTX
Hashing in data structure is presented in these slides
PPT
Chapter 12 ds
PPTX
Hashing Techniques in database management systems
PPTX
Data Structures-Topic-Hashing, Collision
PPTX
Unit4 Part3.pptx
PPTX
Hashing .pptx
PDF
DataBaseManagementSystems-BTECH--UNIT-5.pdf
PPTX
Hashing for computer science studens at mit wpu.pptx
PPTX
hashing in data strutures advanced in languae java
PDF
Hashing and File Structures in Data Structure.pdf
PPTX
Hashing And Hashing Tables
PPTX
Hashing
Hashing gt1
13. Indexing MTrees - Data Structures using C++ by Varsha Patil
DSA Presentation of Data Structures and Algorithms.pptx
Hashing techniques discussion and examples
Lec12-Hash-Tables-27122022-125641pm.pptx
hashing in data structures and its applications
DBMS 9 | Extendible Hashing
Hashing techniques, Hashing function,Collision detection techniques
Hashing in data structure is presented in these slides
Chapter 12 ds
Hashing Techniques in database management systems
Data Structures-Topic-Hashing, Collision
Unit4 Part3.pptx
Hashing .pptx
DataBaseManagementSystems-BTECH--UNIT-5.pdf
Hashing for computer science studens at mit wpu.pptx
hashing in data strutures advanced in languae java
Hashing and File Structures in Data Structure.pdf
Hashing And Hashing Tables
Hashing

More from widespreadpromotion (7)

PPTX
16. Algo analysis & Design - Data Structures using C++ by Varsha Patil
PPTX
15. STL - Data Structures using C++ by Varsha Patil
PPTX
12. Heaps - Data Structures using C++ by Varsha Patil
PPTX
4. Recursion - Data Structures using C++ by Varsha Patil
PPTX
3. Stack - Data Structures using C++ by Varsha Patil
PPTX
2. Linear Data Structure Using Arrays - Data Structures using C++ by Varsha P...
PPTX
1. Fundamental Concept - Data Structures using C++ by Varsha Patil
16. Algo analysis & Design - Data Structures using C++ by Varsha Patil
15. STL - Data Structures using C++ by Varsha Patil
12. Heaps - Data Structures using C++ by Varsha Patil
4. Recursion - Data Structures using C++ by Varsha Patil
3. Stack - Data Structures using C++ by Varsha Patil
2. Linear Data Structure Using Arrays - Data Structures using C++ by Varsha P...
1. Fundamental Concept - Data Structures using C++ by Varsha Patil

Recently uploaded (20)

PPT
Technicalities in writing workshops indigenous language
PPTX
1.Introduction to orthodonti hhhgghhcs.pptx
PDF
American Journal of Multidisciplinary Research and Review
PDF
Delhi c@ll girl# cute girls in delhi with travel girls in delhi call now
PPTX
9 Bioterrorism.pptxnsbhsjdgdhdvkdbebrkndbd
PDF
Grey Minimalist Professional Project Presentation (1).pdf
PDF
General category merit rank list for neet pg
PDF
9 FinOps Tools That Simplify Cloud Cost Reporting.pdf
PPT
Classification methods in data analytics.ppt
PPTX
AI-Augmented Business Process Management Systems
PDF
book-34714 (2).pdfhjkkljgfdssawtjiiiiiujj
PPTX
Sheep Seg. Marketing Plan_C2 2025 (1).pptx
PDF
newhireacademy couselaunchedwith pri.pdf
PDF
Nucleic-Acids_-Structure-Typ...-1.pdf 011
PPTX
Overview_of_Computing_Presentation.pptxxx
PPTX
Stats annual compiled ipd opd ot br 2024
PPTX
DIGITAL DESIGN AND.pptx hhhhhhhhhhhhhhhhh
PPTX
cyber row.pptx for cyber proffesionals and hackers
PPTX
GPS sensor used agriculture land for automation
PDF
NU-MEP-Standards معايير تصميم جامعية .pdf
Technicalities in writing workshops indigenous language
1.Introduction to orthodonti hhhgghhcs.pptx
American Journal of Multidisciplinary Research and Review
Delhi c@ll girl# cute girls in delhi with travel girls in delhi call now
9 Bioterrorism.pptxnsbhsjdgdhdvkdbebrkndbd
Grey Minimalist Professional Project Presentation (1).pdf
General category merit rank list for neet pg
9 FinOps Tools That Simplify Cloud Cost Reporting.pdf
Classification methods in data analytics.ppt
AI-Augmented Business Process Management Systems
book-34714 (2).pdfhjkkljgfdssawtjiiiiiujj
Sheep Seg. Marketing Plan_C2 2025 (1).pptx
newhireacademy couselaunchedwith pri.pdf
Nucleic-Acids_-Structure-Typ...-1.pdf 011
Overview_of_Computing_Presentation.pptxxx
Stats annual compiled ipd opd ot br 2024
DIGITAL DESIGN AND.pptx hhhhhhhhhhhhhhhhh
cyber row.pptx for cyber proffesionals and hackers
GPS sensor used agriculture land for automation
NU-MEP-Standards معايير تصميم جامعية .pdf

11. Hashing - Data Structures using C++ by Varsha Patil

  • 1. Oxford University Press © 2012Data Structures Using C++ by Dr Varsha Patil 1
  • 2. Oxford University Press © 2012Data Structures Using C++ by Dr Varsha Patil 2  Use of hashing techniques that support very fast retrieval via a key  Factors that affect the performance of hashing  Collision resolution strategies
  • 3. Oxford University Press © 2012Data Structures Using C++ by Dr Varsha Patil 3  Hashing is finding an address where the data is to be stored as well as located using a key with the help of the algorithmic function  Hashing is a method of directly computing the address of the record with the help of a key by using a suitable mathematical function called the hash function  A hash table is an array-based structure used to store <key, information> pairs
  • 4. Fig 11.1: Hashing concept (key → Hash(key) → address)
  • 5. The resulting address is used as the basis for storing and retrieving records; it is called the home address of the record. To store a record in a hash table, the hash function is applied to the key of the record being stored, returning an index within the range of the hash table. The item is then stored in the table at that index position.
  • 6. 1. With hashing, the address generated appears to be random: there is no immediately obvious connection between the key and the location of the corresponding record, even though the key is used to determine that location. For this reason, hashing is sometimes referred to as randomizing. 2. With hashing, two different keys may be transformed to the same address, so two records may be sent to the same place in the file. When this occurs, it is called a collision, and some means must be found to deal with it. Two or more records that result in the same home address are called synonyms.
  • 7. A problem arises, however, when the hash function returns the same value for two different keys. To handle the situation where two records need to be hashed to the same address, we can implement a table structure that has room for two or more members at the same index position.
  • 8. A hash function maps a key into the range [0, Max − 1]; the result is used as an index (or address) into the hash table for storing and retrieving a record. The address generated by the hashing function is called the home address. All home addresses refer to a particular area of memory, and that area is called the prime area.
  • 9. A bucket is an index position in a hash table that can store more than one record. When two keys map to the same index, both records are stored in the same bucket.
  • 10. The result of two keys hashing to the same address is called a collision.
  • 11. Keys that hash to the same address are called synonyms.
  • 12. When further keys hash to the same address and there is no room left in the bucket, an overflow is said to have occurred. Collision and overflow are synonymous when the bucket size is 1.
  • 13. When records are allowed to be stored in potentially unlimited space, it is called open or external hashing.
  • 14. When a fixed space is used for storage, which eventually limits the number of records that can be stored, it is called closed or internal hashing.
  • 15. A hash function is an arithmetic function that transforms a key into an address; the address is used for storing and retrieving a record.
  • 16. A hash function that transforms different keys into different addresses is called a perfect hash function. The worth of a hash function depends on how well it avoids collisions.
  • 17. The maximum storage capacity, that is, the maximum number of records that can be accommodated, is called the loading density.
  • 18. A full table is one in which all locations are occupied. Owing to the characteristics of hash functions, there are always some empty locations.
  • 19. The load factor is the number of records stored in the table divided by the maximum capacity of the table, expressed as a percentage.
  • 20. Rehashing applies to closed hashing. When we try to store the record with Key1 at bucket position Hash(Key1) and find that it already holds a record, a collision has occurred. To handle the collision, we use a strategy that chooses a sequence of alternative locations Hash1(Key1), Hash2(Key1), … within the bucket table in which to place the record with Key1. This is called rehashing.
  • 21. Features of a good hash function; division method; multiplication method; extraction method; mid-square hashing; folding technique; rotation; universal hashing.
  • 22. The average performance of hashing depends on how the hash function distributes the set of keys among the slots. The assumption is that any given record is equally likely to hash into any of the slots, independently of whether any other record has already been hashed to that slot. This assumption is called simple uniform hashing. A good hash function is one that satisfies the assumption of simple uniform hashing.
  • 24. Addresses generated from the keys are uniformly and randomly distributed. Small variations in the value of the key cause large variations in the record addresses, so that records with similar keys are distributed evenly. The hashing function must minimize collisions.
  • 25. One of the required features of a hash function is that the resultant index must lie within the table index range. One simple choice for a hash function is modulus division, indicated as MOD (the operator % in C/C++). The function returns an integer. If any parameter is NULL, the result is NULL. Hash(Key) = Key % M
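The division method above can be sketched in a few lines of C++. This is a minimal illustration, not code from the slides: the names `divisionHash` and `tableSize` are hypothetical, and keys are assumed to be non-negative integers.

```cpp
#include <cassert>
#include <cstddef>

// Division-method hash: Hash(Key) = Key % M.
// `divisionHash` and `tableSize` (the M of the slide) are illustrative names.
std::size_t divisionHash(unsigned long key, std::size_t tableSize) {
    assert(tableSize > 0);   // the remainder of a division by zero is undefined in C++
    return key % tableSize;  // the result always lies in [0, tableSize - 1]
}
```

For instance, with a table of size 10, `divisionHash(4371, 10)` yields slot 1 and `divisionHash(1323, 10)` yields slot 3.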
  • 26. The multiplication method works as follows: 1. Multiply the key Key by a constant A in the range 0 < A < 1 and extract the fractional part of Key × A. 2. Then multiply this value by M and take the floor of the result. $\mathrm{Hash}(\mathrm{Key}) = \lfloor M\,(\mathrm{Key} \times A \bmod 1) \rfloor$, where $\mathrm{Key} \times A \bmod 1$ means the fractional part of Key × A, that is, $\mathrm{Key} \times A - \lfloor \mathrm{Key} \times A \rfloor$, and $A = (\sqrt{5} - 1)/2 = 0.6180339887$
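The two steps of the multiplication method can be sketched as follows. This is a hedged illustration using double-precision arithmetic, which is an assumption of this sketch (production implementations often use fixed-point integer arithmetic instead); the function name `multiplicationHash` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Multiplication-method hash: floor(M * frac(Key * A)) with A = (sqrt(5) - 1) / 2.
// `multiplicationHash` is an illustrative name; doubles are used for simplicity.
std::size_t multiplicationHash(unsigned long key, std::size_t m) {
    assert(m > 0);
    const double A = (std::sqrt(5.0) - 1.0) / 2.0;            // about 0.6180339887
    const double product = static_cast<double>(key) * A;      // Key x A
    const double fraction = product - std::floor(product);    // Key x A MOD 1
    return static_cast<std::size_t>(std::floor(m * fraction)); // index in [0, m - 1]
}
```

Because the fractional part is always below 1, the returned index stays inside the table for any table size M.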
  • 27. When only a portion of the key is used for the address calculation, the technique is called the extraction method. In digit extraction, a few digits are selected and extracted from the key and used as the address.
  • 28. Key → hashed address (here the first, third, and fifth digits are extracted): 345678 → 357; 234137 → 243; 952671 → 927
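The table above is consistent with picking the first, third, and fifth digits of a six-digit key. A minimal sketch under that assumption (the function name `extractDigits135` is hypothetical, and other digit positions could be chosen just as well):

```cpp
#include <cassert>
#include <string>

// Digit extraction: build a 3-digit address from the 1st, 3rd and 5th digits
// of a six-digit key. `extractDigits135` is an illustrative name.
int extractDigits135(unsigned long key) {
    const std::string digits = std::to_string(key);
    assert(digits.size() == 6);  // this sketch handles six-digit keys only
    const std::string picked{digits[0], digits[2], digits[4]};
    return std::stoi(picked);
}
```

Applied to the keys in the table, this gives 345678 → 357, 234137 → 243, and 952671 → 927.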
  • 29. Mid-square hashing takes the square of the key and extracts the middle digits of the squared key as the address. The difficulty arises when the key is large: because the entire key participates in the address calculation, the square of a large key may exceed the storage limit. So mid-square is used when the key is no more than 4 digits long.
  • 30. Key, square, hashed address: 2341, 5480281, 802; 1671, 2792241, 922. The difficulty of storing the square of a large number can be overcome by squaring only a few digits of the key instead of the whole key.
  • 31. If the key is large, we can select a portion of it and then square that portion. Keys and addresses obtained by extracting a few digits, squaring them, and again extracting the middle digits (key, square, hashed address): 234137, 234 × 234 = 054756, 475; 567187, 567 × 567 = 321489, 148
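The whole-key variant of mid-square hashing (the 2341 and 1671 examples) can be sketched as below. The choice of exactly three middle digits and the name `midSquareHash` are assumptions made for this illustration; the sketch centres the three-digit window on the decimal representation of the square.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mid-square hash: square the key and keep the three middle digits.
// `midSquareHash` is an illustrative name; keys are limited to 4 digits,
// as the slides recommend, so the square cannot overflow.
int midSquareHash(unsigned long key) {
    assert(key <= 9999);
    const std::string square = std::to_string(key * key);
    assert(square.size() >= 3);                          // need at least 3 digits
    const std::size_t start = (square.size() - 3) / 2;   // centre a 3-digit window
    return std::stoi(square.substr(start, 3));
}
```

For a seven-digit square such as 5480281 the window covers the third to fifth digits, giving 802.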
  • 32. In the folding technique, the key is subdivided into subparts that are combined, or folded, to form the address. For a key made of digits, we can subdivide the digits into three parts, add them up, and use the result as the address. The subparts of the key can be the same size as the address.
  • 33. There are two types of folding methods. Fold shift: the key value is divided into several parts, each the size of the address, and the left, right, and middle parts are added. Fold boundary: the key value is divided into parts the size of the address, and the left and right parts are folded on the fixed boundary between them and the centre part.
  • 34. For example, the key 987654321 is understood as left 987, centre 654, right 321. For fold shift, the addition is 987 + 654 + 321 = 1962; discard the leading digit 1 and the address is 962. For fold boundary, the addition of the reversed parts is 789 + 456 + 123 = 1368; discard the leading digit 1 and the address is 368.
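Both folding variants from the worked example can be sketched for a nine-digit key split into three three-digit parts. The function names are hypothetical. Note that the fold-boundary sketch follows the slide's arithmetic, which reverses all three parts; some textbooks reverse only the left and right parts, so treat that choice as an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the digits of a 3-digit part, e.g. 987 -> 789.
int reverse3(int part) {
    assert(part >= 0 && part <= 999);
    return (part % 10) * 100 + ((part / 10) % 10) * 10 + part / 100;
}

// Fold shift: add left, centre and right parts, then drop any carry digit.
int foldShift(std::uint32_t key) {
    const int left = static_cast<int>(key / 1000000);
    const int centre = static_cast<int>((key / 1000) % 1000);
    const int right = static_cast<int>(key % 1000);
    return (left + centre + right) % 1000;  // keep a 3-digit address
}

// Fold boundary as computed on the slide: every part is reversed before adding.
int foldBoundaryAllReversed(std::uint32_t key) {
    const int left = static_cast<int>(key / 1000000);
    const int centre = static_cast<int>((key / 1000) % 1000);
    const int right = static_cast<int>(key % 1000);
    return (reverse3(left) + reverse3(centre) + reverse3(right)) % 1000;
}
```

With the key 987654321, `foldShift` gives 962 and `foldBoundaryAllReversed` gives 368, matching the example.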
  • 35. When keys are serial, they vary only in the last digit, and this leads to the creation of synonyms. Rotating the key minimizes this problem; the method is used along with other methods. The key is rotated right by one digit, and folding is then applied to avoid synonyms. For example, rotating the key 120605 gives 512060; the address is then calculated using any other hash function.
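A one-digit right rotation can be sketched as follows. The name `rotateRight` and the explicit `numDigits` parameter are assumptions of this sketch; passing the digit count keeps leading zeros meaningful.

```cpp
#include <cassert>

// Rotate a decimal key right by one digit: the last digit moves to the front.
// `rotateRight` and `numDigits` are illustrative names.
unsigned long rotateRight(unsigned long key, int numDigits) {
    assert(numDigits >= 1);
    unsigned long scale = 1;                      // becomes 10^(numDigits - 1)
    for (int i = 1; i < numDigits; ++i) scale *= 10;
    const unsigned long lastDigit = key % 10;
    return lastDigit * scale + key / 10;
}
```

For the six-digit key 120605 this produces 512060, which would then be passed to another hash function such as folding.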
  • 36. The main idea behind universal hashing is to select the hash function at random, at run time, from a carefully designed set of functions. Because of this randomization, the algorithm can behave differently on each execution, even for the same input. This approach guarantees good average-case performance, no matter what keys are provided as input.
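The slides do not name a particular family of functions. As one possible illustration, the sketch below uses the widely known family h(k) = ((a·k + b) mod p) mod m, with a and b drawn at random when the table is created. The class name `UniversalHash`, the prime p = 2^31 − 1, and the seed parameter are all assumptions of this sketch; the usual collision bound for this family also assumes keys smaller than p.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// One randomly chosen member of the family h(k) = ((a*k + b) mod p) mod m.
// `UniversalHash` is an illustrative name, not something defined in the slides.
class UniversalHash {
public:
    UniversalHash(std::size_t tableSize, std::uint32_t seed) : m_(tableSize) {
        assert(tableSize > 0);
        std::mt19937_64 rng(seed);  // in practice the seed would come from std::random_device
        a_ = std::uniform_int_distribution<std::uint64_t>(1, kPrime - 1)(rng);
        b_ = std::uniform_int_distribution<std::uint64_t>(0, kPrime - 1)(rng);
    }

    std::size_t operator()(std::uint64_t key) const {
        // key % kPrime keeps the product below 2^62, so it cannot overflow.
        return static_cast<std::size_t>(((a_ * (key % kPrime) + b_) % kPrime) % m_);
    }

private:
    static constexpr std::uint64_t kPrime = 2147483647ULL;  // 2^31 - 1
    std::uint64_t a_ = 1;
    std::uint64_t b_ = 0;
    std::size_t m_;
};
```

Two objects built with different seeds will generally map the same key to different slots, which is the run-time randomization the slide describes; a single object always maps a given key to the same slot.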
  • 37. No hash function is perfect. If Hash(Key1) = Hash(Key2), then Key1 and Key2 are synonyms, and if the bucket size is 1, we say that a collision has occurred. As a consequence, we have to store the record with Key2 at some other location. A search is made for a bucket in which to store the record containing Key2, using one of several collision resolution strategies.
  • 38. Open addressing: linear probing, quadratic probing, double hashing, and key offset. Separate chaining (or linked list). Bucket hashing (defers collision but does not prevent it).
  • 39. In open addressing, a collision is resolved by finding an available empty location other than the home address. If Hash(Key) is not empty, the positions are probed in the following sequence until an empty location is found: N(Hash(Key) + C(1)), N(Hash(Key) + C(2)), …, N(Hash(Key) + C(i)), … When the end of the table is reached, the search wraps around to the start and continues up to the location where the collision occurred. The most important factors in avoiding collisions are the table size and the choice of hash function.
  • 40. A hash table in which a collision is resolved by putting the item in the next empty place following the occupied place is said to use linear probing. This strategy looks for the next free location until one is found. The function used for probing linearly from the next location is (Hash(x) + p(i)) MOD Max. As p(i) = i for linear probing, the function becomes (Hash(x) + i) MOD Max. Initially i = 1; if the location is not empty, i becomes 2, 3, 4, … and so on until an empty location is found.
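Linear probing with the probe function (Hash(x) + i) MOD Max can be sketched as follows. The sentinel `kEmptySlot`, the function names, and the restriction to non-negative integer keys are assumptions of this sketch. The loop starts at i = 0 so that the home address is tried first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // sentinel for an unused slot (keys must be >= 0)

// Insert `key` using Hash(x) = x % Max and linear probing.
// Returns false when the table is full.
bool insertLinear(std::vector<int>& table, int key) {
    assert(!table.empty() && key >= 0);
    const std::size_t max = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % max;
    for (std::size_t i = 0; i < max; ++i) {
        const std::size_t pos = (home + i) % max;  // wraps around at the table end
        if (table[pos] == kEmptySlot) {
            table[pos] = key;
            return true;
        }
    }
    return false;
}

// Convenience helper: build a table of size `max` from a list of keys.
std::vector<int> buildLinearTable(const std::vector<int>& keys, std::size_t max) {
    std::vector<int> table(max, kEmptySlot);
    for (int key : keys) insertLinear(table, key);
    return table;
}
```

Inserting the keys 4371, 1323, 6173, 4199, 4344, 9699, 1889 into a table of size 10 places 6173 in slot 4, 4344 in slot 5, 9699 in slot 0, and 1889 in slot 2, after probing past the occupied slots.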
  • 43. With replacement; without replacement.
  • 44. With replacement: if the slot is already occupied by a key, there are two possibilities: either the slot is that key's home address (a collision) or it is not. If the occupying key's home address is different, then the new key, whose home address is this slot, is placed at that position, and the key with the other home address is moved to the next empty position.
  • 45. Without replacement: when data is to be stored in the hash table and the slot is already occupied by a key, another empty location is searched for the new record. There are two possibilities when the location is occupied: it is the occupying key's home address or it is not. In both cases, the without-replacement strategy searches for an empty position for the key that is to be stored.
  • 46. In quadratic probing, the offset added is the square of the collision probe number. The empty location is searched for using the formula $(\mathrm{Hash}(\mathrm{Key}) + i^{2}) \bmod \mathrm{Max}$, where i lies between 1 and (Max − 1)/2. Quadratic probing works much better than linear probing, but to make full use of the hash table there are constraints on the values of i and Max so that the address lies within the table boundaries.
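A minimal sketch of quadratic probing with the formula above, trying the home address first (i = 0) and then i = 1 up to (Max − 1)/2. The sentinel `kEmptySlot` and the function names are assumptions of this sketch, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // sentinel for an unused slot (keys must be >= 0)

// Insert `key` using Hash(Key) = Key % Max and offsets of i*i.
// Returns false if no empty slot is found within the allowed probes.
bool insertQuadratic(std::vector<int>& table, int key) {
    assert(!table.empty() && key >= 0);
    const std::size_t max = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % max;
    for (std::size_t i = 0; i <= (max - 1) / 2; ++i) {
        const std::size_t pos = (home + i * i) % max;
        if (table[pos] == kEmptySlot) {
            table[pos] = key;
            return true;
        }
    }
    return false;  // insertion can fail even when the table is not full
}

// Convenience helper: build a table of size `max` from a list of keys.
std::vector<int> buildQuadraticTable(const std::vector<int>& keys, std::size_t max) {
    std::vector<int> table(max, kEmptySlot);
    for (int key : keys) insertQuadratic(table, key);
    return table;
}
```

With the keys 4371, 1323, 6173, 4199, 4344, 9699, 1889 and a table of size 10, the key 1889 collides at slots 9, 0, and 3 before landing in slot 8.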
  • 47. Double hashing uses two hash functions, one for accessing the home address of a key and the other for resolving the conflict. The probe sequence is Hash1(Key), (Hash1(Key) + i × Hash2(Key)), … for i = 1, 2, 3, 4, …, with each resulting address reduced modulo Max.
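Double hashing can be sketched as below, assuming Hash1(Key) = Key % Max and a second function of the form Hash2(Key) = R − (Key % R). The names `insertDouble`, `buildDoubleTable`, `r`, and `kEmptySlot` are hypothetical, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // sentinel for an unused slot (keys must be >= 0)

// Insert `key` probing (Hash1 + i * Hash2) % Max for i = 0, 1, 2, ...
// with Hash1(Key) = Key % Max and Hash2(Key) = r - (Key % r).
bool insertDouble(std::vector<int>& table, int key, int r) {
    assert(!table.empty() && key >= 0 && r > 0);
    const std::size_t max = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % max;
    const std::size_t step = static_cast<std::size_t>(r - key % r);  // always in [1, r]
    for (std::size_t i = 0; i < max; ++i) {
        const std::size_t pos = (home + i * step) % max;
        if (table[pos] == kEmptySlot) {
            table[pos] = key;
            return true;
        }
    }
    return false;
}

// Convenience helper: build a table of size `max` from a list of keys.
std::vector<int> buildDoubleTable(const std::vector<int>& keys, std::size_t max, int r) {
    std::vector<int> table(max, kEmptySlot);
    for (int key : keys) insertDouble(table, key, r);
    return table;
}
```

Because Hash2 never returns 0, every probe moves to a different slot; with Max = 10 and R = 7, the key 4344 collides at slot 4 and is stored at slot 7.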
  • 48. Example: given the input {4371, 1323, 6173, 4199, 4344, 9699, 1889} and the hash function Key % 10, show the results for the following: 1. Open addressing using linear probing. 2. Open addressing using quadratic probing. 3. Open addressing using double hashing with h2(x) = 7 − (x MOD 7).
  • 49. Table contents after each insertion ("-" marks an empty slot):
      Slot | Initially | Insert 4371 | Insert 1323 | Insert 6173 | Insert 4199 | Insert 4344 | Insert 9699 | Insert 1889
      0    | -         | -           | -           | -           | -           | -           | 9699        | 9699
      1    | -         | 4371        | 4371        | 4371        | 4371        | 4371        | 4371        | 4371
      2    | -         | -           | -           | -           | -           | -           | -           | 1889
      3    | -         | -           | 1323        | 1323        | 1323        | 1323        | 1323        | 1323
      4    | -         | -           | -           | 6173        | 6173        | 6173        | 6173        | 6173
      5    | -         | -           | -           | -           | -           | 4344        | 4344        | 4344
      6    | -         | -           | -           | -           | -           | -           | -           | -
      7    | -         | -           | -           | -           | -           | -           | -           | -
      8    | -         | -           | -           | -           | -           | -           | -           | -
      9    | -         | -           | -           | -           | 4199        | 4199        | 4199        | 4199
  • 50. Let us insert these keys using quadratic probing. For 6173, the hashed address 6173 % 10 gives 3, which is not empty, so quadratic probing gives the address $(6173 + 1^{2}) \bmod 10 = 4$; as it is empty, the key 6173 is stored there. While inserting 4344, location 4 is not empty, so quadratic probing generates the address $(4344 + 1^{2}) \bmod 10 = 5$; as it is empty, 4344 is stored there. For key 9699, location 9 is not empty, so the address is $(9699 + 1^{2}) \bmod 10 = 0$, which is empty, so it is stored there. While inserting 1889, the address $(1889 + 1^{2}) \bmod 10 = 0$ is not empty, so probe again. The address $(1889 + 2^{2}) \bmod 10 = 3$ is not empty, so probe again. The address $(1889 + 3^{2}) \bmod 10 = 8$ is empty, so 1889 is stored at location 8.
  • 51. While inserting 6173, the address is Hash1(6173) = 6173 % 10 = 3, and 3 is not empty. Let us use double hashing. The address is Hash(6173) = [Hash1(6173) + Hash2(6173)] % 10 = [3 + (R − 6173 % R)] % 10 (let R be 7) = [3 + (7 − 6)] % 10 = 4. Since 4 is empty, we store 6173 at location 4. Now let us store 4344. The address is 4344 % 10 = 4, and as location 4 is not empty, we use double hashing and get Hash(4344) = 7. For 9699, double hashing generates address 2, and as it is empty, we store it there. For key 1889, double hashing generates address 0, and as it is empty, we store 1889 at location 0.
  • 52. If the table gets full, insertion using open addressing with quadratic probing might fail or might take too much time. The solution is to build another table that is about twice as big, scan down the entire original hash table, compute the new hash value for each record, and insert the records in the new table. For example, if the table is of size 7 (Table 11.13) and the hash function is key % 7, then:
  • 53. Insert 7, 15, 13, 74, 73. Slot → key: 0 → 7; 1 → 15; 2 → empty; 3 → 73; 4 → 74; 5 → empty; 6 → 13
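The table-growing step can be sketched as follows. Several details here are assumptions of this sketch rather than facts from the slides: the new size is passed in by the caller (17, the first prime above 2 × 7, is used in the usage note), collisions in the new table are resolved by linear probing, and `kEmptySlot` and `rehashInto` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // sentinel for an unused slot (keys must be >= 0)

// Build a larger table and re-insert every record using key % newSize.
// Linear probing is assumed for any collisions in the new table.
std::vector<int> rehashInto(const std::vector<int>& oldTable, std::size_t newSize) {
    assert(newSize > 0 && newSize >= oldTable.size());
    std::vector<int> newTable(newSize, kEmptySlot);
    for (int key : oldTable) {                 // scan down the original table
        if (key == kEmptySlot) continue;       // skip unused slots
        std::size_t pos = static_cast<std::size_t>(key) % newSize;  // new hash value
        while (newTable[pos] != kEmptySlot) pos = (pos + 1) % newSize;
        newTable[pos] = key;
    }
    return newTable;
}
```

Rehashing the size-7 table holding 7, 15, 13, 74, and 73 into a table of size 17 moves 74 to slot 6 and 73 to slot 5, while 7, 15, and 13 land in slots 7, 15, and 13.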
  • 54. Chaining is a technique for handling synonyms that chains together all the records that hash to the same address. Instead of relocating synonyms, a linked list of synonyms is created whose head is at the home address of the synonyms. However, we need to handle pointers to form the chain of synonyms, and extra memory is needed for storing those pointers.
  • 55. Fig 11.2: An example of chaining (table slots 0, 1, 2, …, max − 1, with the keys 322 and 262 stored in a chain)
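Separate chaining can be sketched with one linked list per slot. The class name `ChainedHashTable`, the use of `std::list`, and the hash function key % Max are assumptions of this sketch; with Max = 10 (also an assumption, since the figure does not state the table size), the keys 322 and 262 share slot 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hash table with separate chaining: each slot heads a linked list of synonyms.
// `ChainedHashTable` is an illustrative name; keys must be non-negative.
class ChainedHashTable {
public:
    explicit ChainedHashTable(std::size_t max) : buckets_(max) { assert(max > 0); }

    // Synonyms are appended to the chain at their home address, never relocated.
    void insert(int key) { buckets_[index(key)].push_back(key); }

    // Search walks the chain at the key's home address sequentially.
    bool contains(int key) const {
        const std::list<int>& chain = buckets_[index(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    std::size_t chainLength(std::size_t slot) const { return buckets_[slot].size(); }

private:
    std::size_t index(int key) const {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % buckets_.size();
    }

    std::vector<std::list<int>> buckets_;
};
```

The per-node links inside `std::list` are the extra pointer memory that chaining requires, and a long chain makes `contains` slower because it is searched sequentially.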
  • 56. Chaining: an unlimited number of synonyms can be handled; the additional cost is the overhead of multiple linked lists; sequential search through a chain takes more time. Rehashing: a limited but still good number of synonyms can be handled; the table size is doubled, but no additional link field needs to be maintained; searching is faster than with chaining.
  • 57. An overflow is said to occur when a new identifier is mapped, or hashed, into a full bucket. When the bucket size is one, collision and overflow occur simultaneously.
  • 58. When a new identifier is hashed into a full bucket, we need to find another bucket for it. The simplest solution is to find the closest unfilled bucket. This is called linear probing or linear open addressing.
  • 59. Since the sizes of these lists are not known in advance, the best way to maintain them is as linked chains. In each slot, additional space is required for a link. Each chain has a head node. The head node, however, is usually much smaller than the other nodes, since it has to retain only a link. As the lists are accessed at random, the head nodes should be stored sequentially.
  • 60. Fig 11.3: Chaining (a table with slots 0 to 25 holding the identifiers A, A1, A2, A3, A4, D, E, G, GA, L, Z, and ZA, shown with the synonyms linked into chains)
  • 61. If linear probing or separate chaining is used for collision handling, then on a collision several blocks must be examined to find a key, and when the table is full an expensive rehash is required. For fast searching and fewer disk accesses, extendible hashing is used. It is a type of hash system that treats a hash as a bit string and uses a trie for bucket lookup.
  • 62. Many applications need a dynamic set that supports only the operations Insert, Member (Search), and Delete. A keyed table is an effective data structure for implementing them. Hashing is an excellent technique for implementing keyed tables. A hash table is an array-based structure used to store <key, information> pairs. Hash tables implement insert and find in constant average time. To store an item in a hash table, a hash function is applied to the key of the item being stored, returning an index within the range of the hash table. Hashing is a technique for storing and retrieving information associated with a key, and it makes use of the individual characters or digits in the key itself. A problem, called a collision, arises when the hash function returns the same value for two different keys. However, there are various collision resolution techniques to overcome this problem.