Thursday, November 3, 2016

info RAID 5

RAID


RAID (originally redundant array of inexpensive disks; now commonly redundant array of independent disks) is a data storage virtualization technology that combines multiple disk drive components into a logical unit for the purposes of data redundancy or performance improvement.^[1]

Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the specific level of redundancy and performance required. The different schemes or architectures are named by the word RAID followed by a number (e.g. RAID 0, RAID 1). Each scheme provides a different balance between the key goals: reliability, availability, performance, and capacity. RAID levels greater than RAID 0 provide protection against unrecoverable (sector) read errors, as well as whole disk failure.

§History

The term "RAID" was invented by David Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987. In their June 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)", presented at the SIGMOD conference, they argued that the top-performing mainframe disk drives of the time could be beaten on performance by an array of the inexpensive drives that had been developed for the growing personal computer market. Although failures would rise in proportion to the number of drives, by configuring for redundancy, the reliability of an array could far exceed that of any large single drive.

Although not yet using that terminology, the technologies of the five levels of RAID named in the paper were used in various products prior to the paper's publication,^[3] including the following:

In 1977, Norman Ken Ouchi at IBM filed a patent disclosing what was subsequently named RAID 4.
In 1986, Clark et al. at IBM filed a patent disclosing what was subsequently named RAID 5.

Industry RAID manufacturers later tended to interpret the acronym as standing for "redundant array of independent disks".

§Concept

Many RAID levels employ an error protection scheme called "parity", a widely used method in information technology to provide fault tolerance in a given set of data. Most use simple XOR, but RAID 6 uses two separate parities based respectively on addition and multiplication in a particular Galois Field or Reed–Solomon error correction.^[14]
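The simple XOR parity described above can be illustrated with a minimal sketch. The function names `xor_parity` and `rebuild_missing` and the sample blocks are hypothetical, chosen only for illustration; real implementations work on whole stripes at the controller or driver level, and the Galois-field parity used by RAID 6 is not shown here.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the bytewise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover a single missing data block from the survivors plus parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving only the contents of the missing block.
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical example: three data blocks, one of which is then "lost".
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_parity(data)
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, the same routine serves both to compute the parity and to reconstruct a lost block, which is why a single parity block can cover the loss of any one block in the set but not two.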

§Standard levels

Main article: Standard RAID levels

A number of standard schemes have evolved. These are called levels. Originally, there were five RAID levels, but many variations have evolved, notably several nested levels and many non-standard levels (mostly proprietary). RAID levels and their associated data formats are standardized by the Storage Networking Industry Association (SNIA) in the Common RAID Disk Drive Format (DDF) standard:

RAID 0: RAID 0 consists of striping, without mirroring or parity. The capacity of a RAID 0 volume is the sum of the capacities of the disks in the set, the same as with a spanned volume. There is no added redundancy for handling disk failures, just as with a spanned volume. Thus, failure of one disk causes the loss of the entire RAID 0 volume, with reduced possibilities of data recovery when compared to a broken spanned volume. Striping distributes the contents of files roughly equally among all disks in the set, which makes concurrent read or write operations on the multiple disks almost inevitable. The concurrent operations make the throughput of most read and write operations equal to the throughput of one disk multiplied by the number of disks. Increased throughput is the big benefit of RAID 0 versus spanned volume.
RAID 1: RAID 1 consists of mirroring, without parity or striping. Data is written identically to two (or more) drives, thereby producing a "mirrored set". Thus, any read request can be serviced by any drive in the set. If a request is broadcast to every drive in the set, it can be serviced by the drive that accesses the data first (depending on its seek time and rotational latency), improving performance. Sustained read throughput, if the controller or software is optimized for it, approaches the sum of throughputs of every drive in the set, just as for RAID 0. Actual read throughput of most RAID 1 implementations is slower than the fastest drive. Write throughput is always slower because every drive must be updated, and the slowest drive limits the write performance. The array continues to operate as long as at least one drive is functioning.
RAID 2: RAID 2 consists of bit-level striping with dedicated Hamming-code parity. All disk spindle rotation is synchronized and data is striped such that each sequential bit is on a different drive. Hamming-code parity is calculated across corresponding bits and stored on at least one parity drive. This level is of historical significance only; although it was used on some early machines (for example, the Thinking Machines CM-2), as of 2014 it is not used by any of the commercially available systems.
RAID 3: RAID 3 consists of byte-level striping with dedicated parity. All disk spindle rotation is synchronized and data is striped such that each sequential byte is on a different drive. Parity is calculated across corresponding bytes and stored on a dedicated parity drive. Although implementations exist, RAID 3 is not commonly used in practice.
RAID 4: RAID 4 consists of block-level striping with dedicated parity. This level was previously used by NetApp, but has now been largely replaced by a proprietary implementation of RAID 4 with two parity disks, called RAID-DP.
RAID 5: RAID 5 consists of block-level striping with distributed parity. Unlike in RAID 4, parity information is distributed among the drives. It requires that all drives but one be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID 5 requires at least three disks.^[11] RAID 5 is seriously affected by the general trends regarding array rebuild time and chance of failure during rebuild.^[21] In August 2012, Dell posted an advisory against the use of RAID 5 in any configuration and of RAID 50 with "Class 2 7200 RPM drives of 1 TB and higher capacity" for business-critical data.
RAID 6: RAID 6 consists of block-level striping with double distributed parity. Double parity provides fault tolerance up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems, as large-capacity drives take longer to restore. As with RAID 5, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced.^[11] With a RAID 6 array, using drives from multiple sources and manufacturers, it is possible to mitigate most of the problems associated with RAID 5. The larger the drive capacities and the larger the array size, the more important it becomes to choose RAID 6 instead of RAID 5.^[23] RAID 10 also minimizes these problems.
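The capacity and fault-tolerance trade-offs of the levels listed above can be summarized in a short sketch. It assumes that all drives in the set are the same size, and it covers only RAID 0, 1, 5 and 6. The function name `raid_summary` is hypothetical, and the four-drive minimum for RAID 6 is an assumption based on common practice rather than something stated above.

```python
def raid_summary(level, drives, drive_size):
    """Return (usable_capacity, tolerated_drive_failures).

    Assumes `drives` equally sized drives of `drive_size` each.
    """
    if drives < 2:
        raise ValueError("an array needs at least two drives")
    if level == 0:
        # Striping only: full capacity, no redundancy.
        return drives * drive_size, 0
    if level == 1:
        # Mirroring: capacity of one drive; survives until the last copy fails.
        return drive_size, drives - 1
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 requires at least three drives")
        # One drive's worth of space holds the distributed parity.
        return (drives - 1) * drive_size, 1
    if level == 6:
        if drives < 4:  # assumed minimum, see lead-in
            raise ValueError("RAID 6 is assumed to need at least four drives")
        # Two drives' worth of space hold the double parity.
        return (drives - 2) * drive_size, 2
    raise ValueError("unsupported RAID level: %r" % level)


# Hypothetical example: four 2 TB drives in RAID 5.
capacity_tb, tolerated = raid_summary(5, 4, 2)
```

With four 2 TB drives, for instance, RAID 5 yields 6 TB usable and tolerates one failure, while RAID 6 on the same drives yields 4 TB and tolerates two.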

§Nested (hybrid) RAID

Main article: Nested RAID levels

In what was originally termed hybrid RAID, many storage controllers allow RAID levels to be nested. The elements of a RAID may be either individual drives or arrays themselves. Arrays are rarely nested more than one level deep.

The final array is known as the top array. When the top array is RAID 0 (such as in RAID 1+0 and RAID 5+0), most vendors omit the "+" (yielding RAID 10 and RAID 50, respectively).

RAID 1+0: creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses so long as no mirror loses all its drives.
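The survival condition for RAID 1+0 can be expressed as a short check. This is a minimal sketch: the function name `raid10_survives` and the drive labels are hypothetical, and the array is modeled simply as a list of mirror sets.

```python
def raid10_survives(mirrors, failed):
    """Return True if every mirror set still has at least one working drive.

    mirrors: list of mirror sets, each a list of drive labels.
    failed:  iterable of labels of drives that have failed.
    """
    failed = set(failed)
    return all(
        any(drive not in failed for drive in mirror)
        for mirror in mirrors
    )


# Hypothetical four-drive array: two mirrored pairs striped together.
mirrors = [["a", "b"], ["c", "d"]]
two_losses_ok = raid10_survives(mirrors, {"a", "c"})   # one drive lost per mirror
two_losses_bad = raid10_survives(mirrors, {"a", "b"})  # one mirror lost entirely
```

The example shows that the number of failed drives alone does not decide the outcome: two failures are survivable when they fall in different mirrors, but fatal when they fall in the same one.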

§Non-standard levels

Main article: Non-standard RAID levels

Many configurations other than the basic numbered RAID levels are possible, and many companies, organizations, and groups have created their own non-standard configurations, in many cases designed to meet the specialized needs of a small niche group. Such configurations include the following:

Hadoop has a RAID system that generates a parity file by XOR-ing a stripe of blocks in a single HDFS file.

§Implementations

The distribution of data across multiple drives can be managed either by dedicated computer hardware or by software. A software solution may be part of the operating system, or it may be part of the firmware and drivers supplied with a hardware RAID controller.

§Software-based

Software RAID implementations are now provided by many operating systems. Software RAID can be implemented as:

A component of the file system (e.g. ZFS, GPFS or Btrfs)
A layer that sits above any file system and provides parity protection to user data (e.g. RAID-F)^[31]

Some advanced file systems are designed to organize data across multiple storage devices directly (without needing the help of a third-party logical volume manager):

Btrfs supports RAID 0, RAID 1 and RAID 10 (RAID 5 and 6 are under development).

Many operating systems include basic RAID implementations:

FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5, and all nestings via GEOM modules and ccd.
Linux's md supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6, and all nestings. Certain reshaping/resizing/expanding operations are also supported.^[45]
Microsoft's server operating systems support RAID 0, RAID 1, and RAID 5. Some of the Microsoft desktop operating systems support RAID. For example, Windows XP Professional supports RAID level 0, in addition to spanning multiple drives, but only if using dynamic disks and volumes. Windows XP can be modified to support RAID 0, 1, and 5.^[46] Windows 8 and Windows Server 2012 introduce a RAID-like feature known as Storage Spaces, which also allows users to specify mirroring, parity, or no redundancy on a folder-by-folder basis.
NetBSD supports RAID 0, 1, 4, and 5 via its software implementation, named RAIDframe.^[48]

If a boot drive fails, the system has to be sophisticated enough to be able to boot off the remaining drive or drives. For instance, consider a computer whose disk is configured as RAID 1 (mirrored drives); if the first drive in the array fails, then a first-stage boot loader might not be sophisticated enough to attempt loading the second-stage boot loader from the second drive as a fallback. The second-stage boot loader for FreeBSD is capable of loading a kernel from such an array.^[49]
