RAID explanation please
Posted by: Steeve on 03 December 2008
Hi
I have been reading various sites and attempting to understand the principle of RAID storage.
Now, I can understand what RAID 1 is as this is just duplicating the data on one drive as a mirror image on a second drive as is used in the HDX.
But, can someone please explain, in VERY VERY simple terms for me, exactly how RAID 5 and 6 work. From what I have read it appears to be able to somehow restore three drives' worth of data from just one drive of data, which I can't get my head around! How does it do this? And what are the implications for music data storage of this arrangement?
Thanks
Steeve
Posted on: 03 December 2008 by Jono 13
quote:Originally posted by Steeve:
restore three drives worth of data from just one drive of data which I can't get my head around! How does it do this?
Steeve
Oops other way around. RAID (Redundant Array of Inexpensive Disks) 5 stripes the data across a number of drives (n), with one drive's worth of capacity used for parity so that the array can be rebuilt should a drive fail. E.g. 4 off 500GB drives gives 1.5TB of storage plus 500GB of parity. For real security, mirroring this array (RAID 51, i.e. RAID 5+1) will need 8 off 500GB drives, but will still only provide 1.5TB of storage.
Hope this explains the principle. As for data storage of music you should hear no difference as the data stream presented to the DAC is the same as if coming from a single disk. The only time RAID 5 is an issue is if you have a lot of users accessing a large database/application at the same time then other configurations work better.
Jono
Posted on: 03 December 2008 by Adam Meredith
quote:Originally posted by Steeve:
But, can someone please explain, in VERY VERY simple terms for me, exactly how...
I suspect there is still an opportunity for this.
Posted on: 03 December 2008 by Steeve
Thanks but not quite simple enough I'm afraid!
What does 'striping' and 'an array' in this context mean? Is it like an indexing system?
I understand that as the data in your example is on three drives in a RAID5 setup, this arrangement allows for the failure of one drive so therefore would only need to restore 500GB worth of data but I still don't understand exactly how it knows what data to restore.
remember....SIMPLE!!!!...please....
Posted on: 03 December 2008 by Steeve
quote:Originally posted by Adam Meredith:quote:Originally posted by Steeve:
But, can someone please explain, in VERY VERY simple terms for me, exactly how...
I suspect there is still an opportunity for this.
Posted on: 03 December 2008 by garyi
Posted on: 03 December 2008 by David Dever
quote:Now, I can understand what RAID 1 is as this is just duplicating the data on one drive as a mirror image on a second drive as is used in the HDX.
The HDX, and the NaimNet servers, do not use the drives in a (by definition) RAID 1 configuration:
- slave drive (data partition only) is not identical to master drive (embedded OS + data partitions)
- slave drive spins up only when necessary, therefore cannot be fully redundant (see below), as any data added to the data partition on the master drive since the prior (last) backup is lost, if this volume fails (though the amount of lost data is typically restricted to a day's worth of data, given factory-set, daily scheduled differential backups)
- backup scheme is timely, though not continuous, and can be paused or canceled by end user (via Desktop Client)
Conversely, a RAID 1 array utilizes two identically-sized volumes (not technically required that the drives themselves be the same size or type), with continuous parallel writes to both volumes, such that at any time T the contents of both volumes are mirror-imaged, i.e., identical.
Posted on: 03 December 2008 by pcstockton
Wikipedia is pretty cool... check it out sometime.
http://en.wikipedia.org/wiki/Redundant_array_of_independent_disks
very SIMPLE
from the above page:
" * RAID 5 (striped disks with parity) combines three or more disks in a way that protects data against loss of any one disk; the storage capacity of the array is reduced by one disk.
* RAID 6 (striped disks with dual parity) (less common) can recover from the loss of two disks."
Posted on: 04 December 2008 by Trev
Steve
My understanding is this, although I could be wrong.
In a raid system you have a number of disks plus one spare. When data is saved it is striped over the discs in use. Bear in mind the one spare is not used.
If one of the discs in use fails, then the other discs can reconstruct the data which was on the failed disc and write this onto the spare.
Thus the failed disc is replaced with a new spare.
Posted on: 04 December 2008 by SB
RAID 5 is "striping with distributed parity"
A typical RAID 5 example uses a set of five drives (the minimum is three). In simple terms each disc write is split across four of the five drives (striped). A parity calculation is then made and the parity data is written on the remaining drive. In the event of a single drive failure, the original data can be recovered using the parity value. "Distributed parity" means that the parity block is rotated across all the drives, not just written to the last drive.
i.e. on the first write data is on drives 1-4, parity on 5, second write, data is on drives 2-5, parity on 1, etc. This prevents a performance bottleneck on the parity drive.
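The rotation described above can be sketched in a few lines of Python. This is only an illustration of the 5-drive example: the function name `stripe_layout` is made up here, and real controllers use several different rotation orders, so treat the exact sequence as an assumption.

```python
def stripe_layout(stripe_index, num_drives=5):
    """Return (data_drives, parity_drive) for one stripe, drives numbered from 1.

    Assumed rotation: parity starts on the last drive, then moves to
    drive 1, drive 2, and so on, matching the example in the post.
    """
    parity = (num_drives - 1 + stripe_index) % num_drives + 1
    data = [d for d in range(1, num_drives + 1) if d != parity]
    return data, parity


for i in range(5):
    data, parity = stripe_layout(i)
    print(f"write {i + 1}: data on drives {data}, parity on drive {parity}")
```

After five writes every drive has held parity once, so no single drive carries all of the parity traffic.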
Data is still available from the remaining drives and the single failed drive can be replaced and the data recreated on the new drive.
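For anyone wondering how the array "knows" what to restore: in RAID 5 the parity is normally a simple XOR of the data blocks, so any one missing block equals the XOR of everything that is left. A minimal sketch, with made-up 4-byte blocks standing in for what would be much larger chunks on real drives:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks (one per data drive) and the parity block for this stripe.
# The contents are arbitrary illustration data.
data_blocks = [b"MUSI", b"C_ON", b"_A_R", b"AID5"]
parity = xor_blocks(data_blocks)

# Pretend the third drive has failed: its block is gone.
survivors = [blk for i, blk in enumerate(data_blocks) if i != 2]

# XOR of the surviving data blocks and the parity gives back the lost block.
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt)  # b'_A_R'
```

The same trick works whichever single drive is lost, including the one holding the parity, but it cannot recover two missing blocks from one stripe.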
The advantage of RAID level 5 is data protection without a massive penalty on storage efficiency, i.e. a 20% penalty on a 5 drive set. The downside is a hit on write performance, as the parity has to be recalculated on each write.
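The storage arithmetic is easy to check. A small sketch follows; the function name `raid5_usable` and the 500GB drive size are only illustrative, the latter borrowed from the 4-drive example earlier in the thread:

```python
def raid5_usable(num_drives, drive_size_gb):
    """Return (usable capacity in GB, fraction of raw capacity spent on parity)."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    usable = (num_drives - 1) * drive_size_gb  # one drive's worth goes to parity
    overhead = 1 / num_drives
    return usable, overhead


for n in (3, 4, 5):
    usable, overhead = raid5_usable(n, 500)
    print(f"{n} x 500GB: {usable}GB usable, {overhead:.0%} used for parity")
```

The more drives in the set, the smaller the parity penalty, which is why the 5 drive set costs 20% while a 4 drive set costs 25%.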
There is a rule of thumb on RAID, "fast, safe, cheap, choose any two".
RAID 1 (mirroring) is safe and reasonably fast, but not cheap as you need double the amount of disc capacity to deliver mirroring.
The RAID level is a standard definition.
Posted on: 05 December 2008 by nkrgovic
Since I do this for a living, and often need to explain this to people, I'll try to do this here as simply as possible:
Both RAID 1 and RAID 5 protect your data from the loss of a single disk. The only difference a user will see is that RAID 1 uses more disks, and RAID 5 is slower. For the purpose of storing music for streaming both are more than fast enough.
RAID 10 is just RAID 1 over many pairs of disks. It is very fast, and, while this is useful for databases, servers, or similar purposes, it doesn't affect the sound or audio streaming. With RAID 10 your data might survive two disks failing, but there is no guarantee - it depends on which disks failed.
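To make the "it depends on which disks failed" point concrete, here is a rough sketch that tries every possible two-disk failure on a four-disk RAID 10. The disk numbering and the helper name `raid10_survives` are assumptions for illustration only:

```python
from itertools import combinations


def raid10_survives(failed, pairs):
    """The array survives as long as no mirror pair has lost both of its disks."""
    return not any(set(pair) <= set(failed) for pair in pairs)


# Four disks arranged as two mirrored pairs: (0, 1) and (2, 3).
pairs = [(0, 1), (2, 3)]
disks = [d for pair in pairs for d in pair]

outcomes = {f: raid10_survives(f, pairs) for f in combinations(disks, 2)}
survived = sum(outcomes.values())
print(f"{survived} of {len(outcomes)} two-disk failures are survivable")
```

With two pairs, four of the six possible double failures hit one disk in each pair and are survivable; the other two take out a whole mirror and lose the data.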
RAID 6 is like RAID 5, but it protects against two disks failing, and wastes more disk space.
The only reasons not to use RAID 5 are:
- You need 3 disks. RAID 1 works with 2.
- You can't boot off RAID 5, unless you have a dedicated hardware RAID controller. So, you can't use software RAID 5 for the entire system.