RAID
REDUNDANT ARRAY OF INDEPENDENT DISKS
By
R.VANITHA SHREE
SKCET
RAID: Introduction
 RAID is an enabling technology that leverages multiple disks as part of a set,
which provides data protection against HDD failures.
 In general, RAID implementations also improve the I/O performance of
storage systems by storing data across multiple HDDs.
 RAID originally stood for “Redundant Arrays of Inexpensive Disks”. The
term has since been redefined to refer to independent disks, to reflect advances
in storage technology.
Implementation of RAID: two types,
hardware and software
Software RAID
 Software RAID uses host-based software to provide RAID functions.
 It is implemented at the operating-system level and does not use a dedicated
hardware controller to manage the RAID array.
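 Software RAID affects overall system performance, because RAID
computations consume host CPU cycles.
 Software RAID implementations typically do not support all RAID levels.
 Since it is implemented at the OS level, the RAID software must be compatible
with the host operating system.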
HARDWARE RAID
 A specialized RAID controller is installed in the host, and the HDDs are
connected to it.
 The RAID controller card sits on the host's PCI bus and manages the
hard disks attached to it.
 Manufacturers also integrate RAID controllers on motherboards.
 This integration reduces the overall cost of the system, but does not provide
the flexibility required for high-end storage systems.
 Key functions of RAID controllers are:
 Management and control of disk aggregations
 Translation of I/O requests between logical disks and physical disks
 Data regeneration in the event of disk failures.
RAID COMPONENTS
 HDDs inside a RAID array are usually contained in smaller sub-enclosures.
 These sub-enclosures, or physical arrays, hold a fixed number of HDDs.
 A subset of disks within a RAID array can be grouped to form logical
associations called logical arrays, also known as RAID sets.
 The number of HDDs in a logical array depends on the
RAID level used.
RAID levels are defined on the basis of striping, mirroring, and parity.
These techniques determine the data availability and performance
characteristics of an array.
STRIPING
 A RAID set is a group of disks. Within each disk, a predefined
number of contiguously addressable disk blocks is defined as a strip.
 The set of aligned strips that spans across all the disks
within the RAID set is called a stripe.
 All strips in a stripe have the same number of blocks,
and decreasing strip size means that data is broken into smaller pieces
when spread across the disks.
 Strip size (also called stripe depth) describes the number of blocks in a strip.
 Stripe width refers to the number of data strips in a stripe.
 Striping can significantly improve I/O performance, but on its own it
provides no protection of data.
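The strip and stripe definitions above can be made concrete with a small address-mapping sketch. It assumes a simple round-robin layout with no parity; the function name `locate_block` and its parameters are hypothetical and chosen only for illustration.

```python
def locate_block(logical_block, strip_size, stripe_width):
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes a plain round-robin striped layout (no parity):
    strip_size   = blocks per strip (stripe depth)
    stripe_width = number of data strips per stripe (one per disk)
    """
    strip_number = logical_block // strip_size     # which strip, counted across the set
    offset_in_strip = logical_block % strip_size   # position inside that strip
    disk = strip_number % stripe_width             # strips rotate across the disks
    stripe = strip_number // stripe_width          # which stripe (row) the strip belongs to
    return disk, stripe * strip_size + offset_in_strip


# With 4-block strips on 3 disks, logical blocks 0-3 sit on disk 0,
# blocks 4-7 on disk 1, blocks 8-11 on disk 2, and block 12 wraps to disk 0.
print(locate_block(13, strip_size=4, stripe_width=3))
```

Consecutive strips land on different disks, which is why a large sequential request can be serviced by several disks at once.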
MIRRORING
 Mirroring is a technique whereby data is stored on
two different HDDs, yielding two copies of data.
 In the event of one HDD failure,
the data is intact on the surviving HDD.
 In addition to providing complete data redundancy,
mirroring enables faster recovery from disk failure.
 Mirroring duplicates the data, so the amount of storage capacity needed
is twice the amount of data being stored.
 Therefore, mirroring is considered expensive and is preferred for
mission-critical applications that cannot afford data loss.
 Read requests can be serviced by either disk, but every write must be
committed to both disks, so writes take longer.
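The behaviour described above can be sketched as a toy in-memory model. The class `MirroredPair` and its methods are hypothetical names, and Python dictionaries stand in for disks; this is an illustration under those assumptions, not how a real controller is implemented.

```python
class MirroredPair:
    """Toy model of two mirrored disks (dictionaries stand in for HDDs)."""

    def __init__(self):
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Every write goes to each healthy disk, which is the cost of mirroring.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def fail(self, index):
        # Simulate an HDD failure: its contents become unavailable.
        self.failed[index] = True
        self.disks[index].clear()

    def read(self, block):
        # A read can be served by any surviving disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both disks in the mirrored pair have failed")


pair = MirroredPair()
pair.write(7, b"payroll")
pair.fail(0)
print(pair.read(7))  # still served from the surviving disk
```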
PARITY
 Parity is a method of protecting striped data from HDD failure without the cost
of mirroring.
 An additional HDD is added to the stripe width to hold parity, a
mathematical construct that allows the missing data to be re-created.
 Parity is a redundancy check that protects data against a disk failure
without maintaining a full set of duplicate data.
 Parity information can be stored on separate, dedicated HDDs
or distributed across all the drives in a RAID set.
 With four data disks and one parity disk, for example, parity requires
25 percent extra disk space, compared to mirroring, which requires 100
percent extra disk space.
 Parity is recalculated every time there is a change in data. This recalculation is
time-consuming and affects the performance of the RAID controller.
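A minimal sketch of parity calculation and rebuild follows. It assumes bitwise XOR as the parity function, which the slides do not name but which is the usual choice for single parity; the helper names `xor_parity` and `rebuild` are hypothetical.

```python
from functools import reduce


def xor_parity(strips):
    """Return the byte-wise XOR of equally sized strips (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild(surviving_strips, parity):
    """Re-create the single missing strip from the survivors and the parity strip."""
    return xor_parity(list(surviving_strips) + [parity])


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]   # three data strips
parity = xor_parity(data)                        # stored on the parity disk

# Suppose the disk holding data[1] fails: XOR of the rest recovers it.
recovered = rebuild([data[0], data[2]], parity)
print(recovered == data[1])
```

Because the parity strip depends on every data strip in the stripe, any change to one strip forces the parity to be recalculated, which is the write overhead noted above.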
RAID 0
 In a RAID 0 configuration, data is striped across the HDDs in a RAID set.
 It utilizes the full storage capacity by distributing strips of data over multiple HDDs in a
RAID set.
 When the number of drives in the array increases, performance improves because more
data can be read or written simultaneously.
 RAID 0 is used in applications that need high I/O throughput. However, it
provides neither data protection nor availability: if one drive fails, the data
in the set is lost.
RAID 1
 Data is mirrored to improve fault tolerance. RAID 1 is used for applications requiring
high data availability.
 RAID 1+0 is also called a striped mirror. The basic element of RAID 1+0 is a
mirrored pair: data is first mirrored, and then both copies of the data are
striped across multiple HDDs in the RAID set. If a drive fails, the data can be
recovered from the surviving disk of its mirrored pair.
 RAID 0+1 is the reverse, a mirrored stripe: data is first striped across HDDs,
and then the entire stripe is mirrored. If a drive fails, the entire stripe is faulted.
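The difference in fault tolerance between the two nested layouts can be sketched as follows. The disk numbering is an assumption made for illustration: in RAID 1+0 disks (0,1), (2,3), ... form mirrored pairs, and in RAID 0+1 the first half of the disks forms one stripe and the second half its mirror. Both function names are hypothetical.

```python
def raid10_survives(failed, n_disks):
    """RAID 1+0: data is lost only if both disks of some mirrored pair fail."""
    return all(not ({i, i + 1} <= failed) for i in range(0, n_disks, 2))


def raid01_survives(failed, n_disks):
    """RAID 0+1: one failure faults a whole stripe, so one stripe must be intact."""
    half = n_disks // 2
    stripe_a = set(range(half))
    stripe_b = set(range(half, n_disks))
    return not (failed & stripe_a) or not (failed & stripe_b)


# Disks 0 and 2 fail in a four-disk set.
failed = {0, 2}
print(raid10_survives(failed, 4))  # each pair still has one good disk
print(raid01_survives(failed, 4))  # both stripes are faulted
```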
RAID 3
 RAID 3 stripes data for high performance and uses parity for improved
fault tolerance.
 Parity information is stored on a dedicated drive so that data can be
reconstructed if a drive fails.
 RAID 3 uses byte-level striping, so it always reads and writes complete
stripes of data across all disks, with the drives operating in parallel.
 RAID 3 suits applications with large sequential data access, such as video streaming.
RAID 4
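 Like RAID 3, RAID 4 stripes data and keeps parity on a dedicated disk, but
it uses block-level striping.
 Because data is striped in blocks, the data disks can be accessed independently:
a single block can be read or written without touching the entire stripe.
 RAID 4 provides good read throughput and reasonable write throughput.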
RAID 5
 RAID 5 is a very versatile RAID implementation.
 The main difference between RAID 4 and RAID 5 is the parity location.
In RAID 5, parity is distributed across all disks.
 This overcomes the write bottleneck of a dedicated parity disk.
 RAID 5 is preferred for messaging, data mining, and relational database
management system (RDBMS) implementations in which database
administrators (DBAs) optimize data access.
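Distributed parity can be pictured with a short layout sketch. The rotation rule used here (parity moves one disk to the left on each successive stripe) is only one possible convention and is an assumption; real implementations differ in how they rotate parity. The names `parity_disk` and `stripe_layout` are hypothetical.

```python
def parity_disk(stripe, n_disks):
    """Disk index holding parity for a given stripe (assumed rotation rule)."""
    return (n_disks - 1 - stripe) % n_disks


def stripe_layout(stripe, n_disks):
    """Return 'P' for the parity strip and 'D' for data strips in one stripe."""
    p = parity_disk(stripe, n_disks)
    return ["P" if disk == p else "D" for disk in range(n_disks)]


# Print the first four stripes of a four-disk RAID 5 set.
for s in range(4):
    print(s, " ".join(stripe_layout(s, 4)))
```

Over any four consecutive stripes each disk holds parity exactly once, so parity writes are spread across the set instead of queuing on one dedicated disk as in RAID 4.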
RAID 6
 RAID 6 implementation requires at least four disks.
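 Like RAID 5, RAID 6 distributes parity across all the disks, but it keeps a
second parity element so that the set can survive the failure of two disks.
 The extra parity raises the write penalty, so RAID 5 writes perform better than RAID 6.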
 The rebuild operation in RAID 6 may take longer than that in RAID 5 due to the presence of
two parity sets.
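The capacity cost of the levels covered above can be compared with a small calculator. This is a sketch assuming equally sized disks; the function `usable_disks` is a hypothetical helper, and apart from the four-disk minimum for RAID 6 stated above it checks only that a mirrored set has an even number of disks.

```python
def usable_disks(level, n_disks):
    """Number of disks' worth of capacity left for data (equal-size disks assumed)."""
    if level == "0":
        return n_disks                      # striping only, no redundancy
    if level in ("1", "1+0", "0+1"):
        if n_disks % 2:
            raise ValueError("mirrored sets need an even number of disks")
        return n_disks // 2                 # every block is stored twice
    if level in ("3", "4", "5"):
        return n_disks - 1                  # one disk's worth of parity
    if level == "6":
        if n_disks < 4:
            raise ValueError("RAID 6 requires at least four disks")
        return n_disks - 2                  # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


for level in ("0", "1+0", "5", "6"):
    print(level, usable_disks(level, 8))
```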
