File Organization in DBMS | Set 3

Last Updated : 19 Sep, 2023

B+ Tree file organization, as the name suggests, stores the records of a file in a tree-like structure. It relies on key indexing: the records are ordered by primary key, and each primary key is mapped to an index value, which is the address of the corresponding record in the file.

A B+ tree resembles a binary search tree, except that a node can have more than two children. All records are stored in the leaf nodes, and the intermediate nodes only hold keys and pointers that direct a search toward the right leaf. The leaf nodes are linked together in key order, so they always form a sorted sequential linked list.

In the above diagram, 56 is the root node, which is also called the main node of the tree.
The intermediate nodes consist only of the addresses of leaf nodes; they do not contain any actual records. Leaf nodes hold the actual records. All leaf nodes are at the same depth, so the tree stays balanced.
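The structure described above can be sketched in a few lines of Python. This is a minimal, read-only illustration rather than a full B+ tree: insertion and node splitting are left out, and the class names (`Leaf`, `Internal`), the helper functions, and the sample keys are all assumptions made for the example, not taken from the diagram.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                      # sorted primary keys
    records: list                   # records[i] belongs to keys[i]
    next: Optional["Leaf"] = None   # link to the next leaf in key order


@dataclass
class Internal:
    keys: list        # separator keys only, no records
    children: list    # len(children) == len(keys) + 1


def find_leaf(node, key):
    """Descend from the root to the leaf that could hold `key`."""
    while isinstance(node, Internal):
        # Keys greater than or equal to keys[i] live under children[i + 1].
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Return the record stored under `key`, or None if it is absent."""
    leaf = find_leaf(root, key)
    for k, rec in zip(leaf.keys, leaf.records):
        if k == key:
            return rec
    return None


def range_scan(root, low, high):
    """Collect records with low <= key <= high by walking the leaf chain."""
    leaf = find_leaf(root, low)
    out = []
    while leaf is not None:
        for k, rec in zip(leaf.keys, leaf.records):
            if k > high:
                return out
            if k >= low:
                out.append(rec)
        leaf = leaf.next
    return out


# Hypothetical sample data: three linked leaves under a single root.
leaf_a = Leaf([10, 20], ["rec10", "rec20"])
leaf_b = Leaf([30, 40], ["rec30", "rec40"])
leaf_c = Leaf([50, 60], ["rec50", "rec60"])
leaf_a.next, leaf_b.next = leaf_b, leaf_c
root = Internal(keys=[30, 50], children=[leaf_a, leaf_b, leaf_c])
```

Here `search(root, 40)` follows the root's separator keys down to the middle leaf, while `range_scan(root, 20, 50)` descends once and then follows the `next` links between leaves, which is the reason the leaves are kept as a sorted linked list.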

Advantages of B+ Tree File Organization

Tree traversal is fast because the tree is balanced: every search follows a root-to-leaf path of the same length.
Searching is easy because all records are stored only in leaf nodes, which are kept sorted in a sequential linked list.
There is no restriction on B+ tree size. It grows or shrinks as the amount of data increases or decreases.

Disadvantages of B+ Tree File Organization

Inefficient for static tables, where the overhead of maintaining the tree structure brings little benefit.

Cluster File Organization

In cluster file organization, two or more related tables/records are stored within the same file, known as a cluster. Such a file holds rows from two or more tables in the same data block, and the key attributes used to map these tables together are stored only once.

Thus it lowers the cost of searching and retrieving related records that would otherwise sit in different files, as they are now combined and kept in a single cluster. For example, suppose we have two tables (relations), Employee and Department, that are related to each other.

Therefore these tables can be combined using a join operation and stored together in a cluster file.

If we have to insert, update, or delete a record, we can do so directly. Data is sorted based on the primary key or the key on which searches are performed. The cluster key is the key on which the tables are joined.
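As a rough illustration of the Employee and Department example, the sketch below groups rows from both tables under a shared cluster key so that the key value is stored only once per group. The column names (`dept_id`, `dept_name`, `emp_id`, `emp_name`), the sample rows, and the `build_cluster` helper are assumptions made for the example; a real DBMS does this at the level of data blocks on disk, not Python dictionaries.

```python
from collections import defaultdict

# Hypothetical sample rows; the column names are assumptions.
departments = [
    {"dept_id": 1, "dept_name": "Sales"},
    {"dept_id": 2, "dept_name": "Research"},
]
employees = [
    {"emp_id": 101, "emp_name": "Asha", "dept_id": 1},
    {"emp_id": 102, "emp_name": "Ravi", "dept_id": 2},
    {"emp_id": 103, "emp_name": "Meera", "dept_id": 1},
]


def build_cluster(departments, employees, cluster_key="dept_id"):
    """Group rows of both tables by the cluster key.

    The key value is stored once, as the dictionary key, and is dropped
    from the individual rows kept under it.
    """
    cluster = defaultdict(lambda: {"department": None, "employees": []})
    for d in departments:
        row = {k: v for k, v in d.items() if k != cluster_key}
        cluster[d[cluster_key]]["department"] = row
    for e in employees:
        row = {k: v for k, v in e.items() if k != cluster_key}
        cluster[e[cluster_key]]["employees"].append(row)
    return dict(cluster)


cluster = build_cluster(departments, employees)
```

With this layout, reading `cluster[1]` returns the Sales department together with its two employees in one lookup, which mirrors how a cluster file avoids a separate join at query time.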

Types of Cluster File Organization

There are two ways to implement this method.

Indexed Clusters: In indexed clustering, the records are grouped based on the cluster key and stored together. The Employee and Department example above is an indexed cluster in which the records are grouped by Department ID.
Hash Clusters: This is similar to an indexed cluster, except that instead of grouping the records directly on the cluster key, we compute a hash value from the key and store together the records that share the same hash value.
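The hash cluster idea can be sketched as follows. This is a toy model under stated assumptions: the bucket count, the modulo hash function, the `dept_id` column, and the helper names are all hypothetical and chosen only to make the grouping visible.

```python
NUM_BUCKETS = 4  # assumed number of hash buckets


def hash_bucket(cluster_key: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Toy hash function: the bucket is the key modulo the bucket count."""
    return cluster_key % num_buckets


def build_hash_cluster(rows, key="dept_id", num_buckets=NUM_BUCKETS):
    """Place every row in the bucket selected by hashing its cluster key."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[hash_bucket(row[key], num_buckets)].append(row)
    return buckets


def lookup(buckets, key_value, key="dept_id"):
    """Hash the key to pick one bucket, then filter out colliding keys."""
    bucket = buckets[hash_bucket(key_value, len(buckets))]
    return [row for row in bucket if row[key] == key_value]


# Hypothetical sample rows.
rows = [
    {"emp_id": 101, "dept_id": 1},
    {"emp_id": 102, "dept_id": 2},
    {"emp_id": 103, "dept_id": 1},
    {"emp_id": 104, "dept_id": 5},  # 5 % 4 == 1, so it shares a bucket with dept 1
]
buckets = build_hash_cluster(rows)
```

A lookup touches a single bucket without consulting an index, but rows whose keys hash to the same value share that bucket, so `lookup` still has to compare the actual key. An indexed cluster would instead locate the group through an index on the cluster key.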

Advantages of Cluster File Organization

It is mainly used when multiple tables are frequently joined on the same join condition.
It performs best when the relationship cardinality is 1:m.

Disadvantages of Cluster File Organization

It gives low performance in the case of a large database.
In the case of 1:1 cardinality, it becomes ineffective.

ISAM (Indexed Sequential Access Method)

ISAM is a combination of the sequential and indexed methods. Data is stored sequentially, but an index is maintained for faster access. Think of it as a bookmark in a book that guides you to specific pages.
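A minimal sketch of this idea, assuming a sparse index that records the first key of each data block, is shown below. The block contents, the `index` list, and the function names are hypothetical; real ISAM implementations also deal with overflow areas and multi-level indexes, which are omitted here.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, record) pairs stored in key order.
blocks = [
    [(5, "a"), (12, "b"), (18, "c")],
    [(23, "d"), (31, "e"), (37, "f")],
    [(42, "g"), (56, "h"), (64, "i")],
]

# Sparse index: the first key of each block.
index = [block[0][0] for block in blocks]


def isam_lookup(key):
    """Random access: use the index to pick one block, then scan only it."""
    pos = bisect_right(index, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the file
    for k, rec in blocks[pos]:
        if k == key:
            return rec
    return None


def sequential_scan(start_key):
    """Sequential access: jump to the starting block, then read in order."""
    pos = max(bisect_right(index, start_key) - 1, 0)
    for block in blocks[pos:]:
        for k, rec in block:
            if k >= start_key:
                yield rec
```

The same file serves both access patterns: `isam_lookup(31)` reads a single block chosen through the index, and `list(sequential_scan(37))` starts at the right block and then reads the remaining records in stored order.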

Advantages of ISAM

Faster retrieval compared to pure sequential methods.
Suitable for applications with a mix of sequential and random access.

Disadvantages of ISAM

Index maintenance can add overhead in terms of storage and update operations.
Not as efficient as fully indexed methods for random access.

Smitha Dinesh Semwal
