RICKARD NOBEL AB

Specialists in IT infrastructure services

How RAID 5 actually works

Posted on July 26, 2011 (updated May 7, 2013) by Rickard Nobel

In this article we will look in some detail at how the RAID 5 parity is created and how it is possible to actually “read” from a destroyed disk in a RAID 5 set.

There are many sources on the web about the general principle of RAID 5, so we will not be covering that part here. In short, and as you might know, RAID level 5 works with any number of disks equal to or greater than 3 and stores parity information, distributed across the disks in the set, to be able to recover from a disk failure (striped blocks with distributed parity). We can, for example, combine eight physical disks into a RAID 5 set while consuming only the capacity of one disk for parity information. If any single drive breaks down, we still have full access to the data that was on the destroyed disk.
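To make “distributed parity” a little more concrete, here is a minimal Python sketch of how the parity block could rotate between the disks from one stripe to the next. The rotation rule in `parity_disk` is an assumption chosen for illustration; real RAID implementations use different rotation orders, and the function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Return the (0-based) disk holding the parity for a given stripe.

    Assumed rule for illustration: parity starts on the last disk and
    moves one disk to the left for every new stripe.
    """
    return (num_disks - 1 - stripe) % num_disks


def layout(num_disks: int, num_stripes: int) -> list[str]:
    """Return one string per stripe, with D for data and P for parity."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append("".join("P" if disk == p else "D" for disk in range(num_disks)))
    return rows


# Each row is one stripe, each column is one disk.
for row in layout(num_disks=4, num_stripes=4):
    print(row)
# DDDP
# DDPD
# DPDD
# PDDD
```

Every disk ends up holding both data and parity, and over four stripes the parity consumes exactly one disk's worth of capacity.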

To understand how this is possible we have to look at the smallest unit, the bit, which can be 1 or 0. When doing calculations in binary we have several so-called Boolean algebra operations, for example the AND operation and the OR operation.

One of these low-level logical operations is used heavily in RAID 5: the XOR (“exclusive or”). XOR takes two binary digits and produces a true result if exactly one digit is true (i.e. the other digit needs to be false).

Value A   Value B   XOR result
   0         0          0
   0         1          1
   1         0          1
   1         1          0

This means that, for example, 1 XOR 0 = 1 and 1 XOR 1 = 0. Exactly one of the two binary digits must be 1 for the result to be “true”, that is, 1.
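As a small illustration, the truth table can be reproduced with Python's built-in bitwise XOR operator `^`. The helper name `xor_bit` is made up for this sketch.

```python
def xor_bit(a: int, b: int) -> int:
    """Return 1 if exactly one of the two bits is 1, otherwise 0."""
    return a ^ b


# Walk through all four input combinations.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_bit(a, b))
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 0
```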

Let us now see how the parity calculations are done in a RAID 5 set using XOR. Assume we have a small RAID 5 set of four disks and some data is written to it. For simplicity we look at only half a byte (4 bits), but the principle holds regardless of the stripe size or the number of disks.

On the first three disks we have the binary information 1010, 1100 and 0011, here representing some data, and we now have to calculate the parity information for the fourth disk, which holds the parity for this particular stripe.

Looking at the first (leftmost) “column” across the three data disks, we have 1, 1 and 0. Using XOR to calculate the result gives:

1 XOR 1 XOR 0 = Parity bit

This could be written as: (1 XOR 1) XOR 0 = Parity bit

This means we first calculate 1 XOR 1 = 0 for the first two disks and then XOR that result, the zero, with the bit on the third disk. Since that bit is also 0, we get 0 XOR 0 = 0, so the parity bit is 0.

For the second “column” we have 0, 1 and 0. We first calculate 0 XOR 1 = 1 and then XOR this result with the bit on the third disk: 1 XOR 0 = 1. The parity bit here will be 1.

For the third column we would have:

1 XOR 0 XOR 1 = Parity

Broken down: 1 XOR 0 = 1 and then 1 XOR 1 = 0

And finally the fourth column:

0 XOR 0 XOR 1 = 1

For all four columns together this gives the parity 0101, which is written to the fourth disk.
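The column-by-column calculation above can be checked with a few lines of Python. This is only a sketch of the principle: the function name `parity` is hypothetical, and XOR-ing the whole 4-bit values at once is the same as XOR-ing each column separately.

```python
from functools import reduce
from operator import xor


def parity(blocks: list[int]) -> int:
    """XOR all data blocks together to get the parity block."""
    return reduce(xor, blocks)


data = [0b1010, 0b1100, 0b0011]  # Disk 1, Disk 2, Disk 3
p = parity(data)
print(format(p, "04b"))  # 0101

# The same result, one column (bit position) at a time, left to right.
for column in range(4):
    bits = [(block >> (3 - column)) & 1 for block in data]
    print(bits, "->", reduce(xor, bits))
# [1, 1, 0] -> 0
# [0, 1, 0] -> 1
# [1, 0, 1] -> 0
# [0, 0, 1] -> 1
```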


If any of disks 1, 2 or 3 breaks, the parity information on Disk 4 can be used to recreate the missing data. Let us look at how this is done. Assume that Disk 2 unexpectedly goes down: we have lost all read and write access to the real Disk 2, but with the help of the already recorded parity we can calculate the missing information.

The primary feature of a RAID 5 disk set is the ability to “access” the data on a missing disk. This is done by running the exact same XOR operation over the remaining disks and the parity information. Let us look at the first column again: 1 XOR 0 = 1 (for Disk 1 and Disk 3) and then 1 XOR 0 (the parity) = 1. This means that there must have been a binary digit of 1 on the missing disk. If we do the same operation on the other columns we end up with 1100, which is exactly the data that was on the failed drive.
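The recovery step can be sketched in Python as well. The function name `rebuild_missing` is an assumption for this example; the values are the ones used in the article, with Disk 2 as the failed disk.

```python
from functools import reduce
from operator import xor


def rebuild_missing(surviving_blocks: list[int]) -> int:
    """XOR the surviving blocks (data and parity) to get the lost block."""
    return reduce(xor, surviving_blocks)


disk1, disk3, parity = 0b1010, 0b0011, 0b0101  # Disk 2 has failed
disk2 = rebuild_missing([disk1, disk3, parity])
print(format(disk2, "04b"))  # 1100
```

The same function also covers the case where the lost block is the parity itself: XOR-ing the three data blocks simply recomputes the parity.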

The XOR operation itself is extremely quick and easily handled by the CPU or RAID controller, but the big downside is that we have to read from ALL the other disks to recreate the data on the missing one. If we have, for example, eight disks in the set and one is broken, then a single read IO against the missing disk generates seven disk IOs against the remaining disks to calculate the lost data on the fly.
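The cost of a degraded read can be illustrated with a toy model that counts physical reads. Everything here (the `Stripe` class, its attributes and the sample data) is hypothetical and only models a single in-memory stripe, not a real controller.

```python
from functools import reduce
from operator import xor


class Stripe:
    """One stripe in memory; blocks[i] is the block stored on disk i."""

    def __init__(self, blocks: list[int]) -> None:
        self.blocks = list(blocks)
        self.failed = None  # index of the failed disk, if any
        self.physical_reads = 0

    def _read_disk(self, disk: int) -> int:
        if disk == self.failed:
            raise IOError(f"disk {disk} is not available")
        self.physical_reads += 1
        return self.blocks[disk]

    def read(self, disk: int) -> int:
        """Read a block, rebuilding it from all other disks if its disk failed."""
        if disk != self.failed:
            return self._read_disk(disk)
        survivors = [self._read_disk(d) for d in range(len(self.blocks)) if d != disk]
        return reduce(xor, survivors)


# Eight disks: seven data blocks plus one parity block.
data = [0b1010, 0b1100, 0b0011, 0b1111, 0b0001, 0b0110, 0b1001]
stripe = Stripe(data + [reduce(xor, data)])
stripe.failed = 1  # the second disk breaks

value = stripe.read(1)
print(format(value, "04b"), stripe.physical_reads)  # 1100 7
```

A read from a healthy disk in the same stripe would still cost only one physical read.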

The XOR operation works perfectly mathematically with one disk missing, but the moment a second disk is lost we no longer have enough information to make the calculations. While it is possible to keep using the RAID 5 set with one disk missing for some time with degraded performance, it is best to replace the damaged disk and begin the full rebuild as soon as possible (a hot spare is quite handy here).
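Why a second failure is fatal can be shown with a short Python sketch using the article's values, here assuming that both Disk 2 and Disk 3 are lost. With only Disk 1 and the parity left, all we can compute is the XOR of the two missing blocks, and many different pairs of blocks fit that value.

```python
from itertools import product

disk1, parity = 0b1010, 0b0101  # survivors; Disk 2 and Disk 3 are lost
needed = disk1 ^ parity         # this equals Disk 2 XOR Disk 3, nothing more

# Every pair of 4-bit values that is consistent with the surviving data.
candidates = [
    (d2, d3) for d2, d3 in product(range(16), repeat=2) if d2 ^ d3 == needed
]
print(len(candidates))                # 16
print((0b1100, 0b0011) in candidates)  # True
```

The original contents (1100 and 0011) are among the candidates, but nothing in the surviving data singles them out from the other fifteen pairs.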

See also this blog post on the RAID 5 write penalty.

32 thoughts on “How RAID 5 actually works”

  1. Anders Olsson says:
    February 5, 2013 at 14:37

    Would love to see a “How RAID6 actually works” from you. Or do you just add another layer with exactly the same parity?

    1. Rickard Nobel says:
      February 5, 2013 at 14:44

      I have plans to add blog posts about details in both RAID 10 and RAID 6, hope to be able to get time to write them down soon. Thanks for your comment!

  2. Jules says:
    February 19, 2015 at 11:41

    Thank you very much for this article… Helped me to explain how RAID 5 works in my presentation.

    Thanks brother

    1. Rickard Nobel says:
      February 19, 2015 at 21:27

      I am glad to hear that Jules, thanks for your comment.

      1. Jules says:
        March 4, 2015 at 11:47

        Just a small question to get it right:
        Isn’t it in RAID 5 that the calculated parities are NOT on one disk but rather spread on all disks?

        1. Rickard Nobel says:
          March 4, 2015 at 12:16

          Jules, that is correct, and the main difference between RAID4 and RAID5.

          The parities are, however, stored together in a “stripe” whose size is not standardized and could be, for example, 128 KB. That means that a certain amount of parity bits will be stored on the same physical disk, then the next disk will hold the parity, and so on.

          In the examples above we look at just a few bits in detail and how they are calculated. They are assumed to be located together in one “stripe”, but since that is not explicitly pointed out I understand your question. 🙂

          Regards, Rickard

  3. Solomon says:
    March 3, 2015 at 15:23

    Thanks a lot for the explanation.

  4. Debashis says:
    June 19, 2015 at 13:24

    Very nice explanation. Thank you for this.

  5. Robert Sides says:
    July 28, 2015 at 18:14

    Does this mean then.. that if you removed an SAS drive from a RAID array e.g. Dot Hill AssuredSAN, you would not be able to retrieve data from an individual drive without having it connected with all the other original drives?

    I am in this position, and wish to sell the Seagate SAS drives individually but do not know how to wipe them, or even if I need to, before I sell.

    Thanks.

    1. Ian says:
      December 29, 2015 at 15:29

      No, not quite. RAID 5 doesn’t guarantee that file data will be split among drives. It records data based on cluster size. Some data — especially small chunks of information, such as credit card numbers, and some personally identifiable information — may reside on one drive. There’s more than enough complete information on a single drive from a RAID 5 setup to make it a security risk if sensitive information was on that array.

      Upshot: if you’re disposing of a single drive from a RAID-5 setup that contained sensitive data, wipe or physically destroy the drive.

  6. GSR says:
    February 3, 2016 at 09:52

    simple and clear explanation..

  7. jien says:
    March 3, 2016 at 03:05

    Just to be sure, Rickard: if I use RAID 5 with just 3 disks, does it mean that my 1st disk saves 0, my 2nd disk saves 1 and my 3rd disk saves both 0 and 1?

    1. Rickard Nobel says:
      December 1, 2016 at 08:55

      Hello Jien, no, not really. If we assume that the third disk holds the parity for this particular data then it will store a “1”.

  8. Reza says:
    June 5, 2016 at 08:06

    Hello Dear Rickard Nobel,

    I have read your article about how raid 5 works and now I have few questions. (First, I have quoted some sentences from your article and then I’ve asked my questions)

    >> we now have to calculate the parity information for the fourth disk.

    – Does only the fourth disk include the parity bits? I read an article somewhere and it said that every disk in the RAID 5 set includes parity bits, not just the fourth disk. I’m really confused about that.

    >> If any of Disk number 1, 2 or 3 would break the parity information on Disk 4 could be used to recreate the missing data.

    – What if Disk 4 dies? If we lose the fourth disk, will everything be gone because we’ve lost the parity information?

    Thanks.

    1. Jeff says:
      July 14, 2017 at 17:24

      Reza: I used to have a bit of a problem understanding this as well. The “n”th drive is not a parity drive – the data is stored on all the disks. The parity data is stored on all the disks.

      If you have 4 1TB drives, you can have a RAID 5 array of 3TB.
      I’ll give eight bits for three drives, and the parity. In my example, we want a parity of 1.
      10010110 – Disk 1
      11000011 – Disk 2
      01011010 – Disk 3

      11110000 – Disk 4 [parity]
      (I did it deliberately. 🙂 )

      Even though I labeled them Disk 1 through 4, the order doesn’t matter – the “Disk 4” is the parity stripe, and could be on any of the drives.

      The idea here is to have data on N-1 disks, and the last disk is a parity stripe from which, with any N-2 disks, you can recreate a missing disk’s data.

      If Disk 4 dies, we can recalculate all the missing data from the other three.
      If it died and we replaced it, the system will rebuild the data – – If it was a data stripe, the parity stripe will let us “rebuild” the missing data. If it was a parity stripe, it will recalculate the parity.

      RAID 4 has a dedicated parity drive
      RAID 5 has the parity spread out so that each drive is being used for parity AND data. (This helps in a reduced system when a drive fails, as only 1/N of the data needs to be rebuilt from parity instead of all the data.)

      I hope that clears it up a bit for you (a year and then some later) or for others who also wanted this answered.

      1. masterase says:
        March 8, 2018 at 18:56

        What you forget in your example: when using RAID 5 with 4 disks, parity and data will ALWAYS be striped equally across ALL the disks.
        FIRST BLOCK
        Disk1 – DATA
        Disk2 – DATA
        DISK3 – DATA
        DISK4 – Calculated Parity

        NEXT BLOCK (2)
        Disk1 – Calculated Parity
        Disk2 – DATA
        DISK3 – DATA
        DISK4 – DATA

        Block 3
        Disk1 – DATA
        Disk2 – Calculated Parity
        DISK3 – DATA
        DISK4 – DATA

        Block 4
        Disk1 – DATA
        Disk2 – DATA
        DISK3 – Calculated Parity
        DISK4 – DATA

        So after 4 Blocks it will look Like
        DISK 1 – 4
        DPDD
        DDPD
        DDDP
        PDDD

        DATA PARITY
        OK??

        Greetings from Germany and thanks for the great article by Rickard.

  10. Dome says:
    August 21, 2017 at 07:47

    You are an actual legend Rickard.

  11. Paul says:
    November 9, 2017 at 20:14

    Brilliant!

    I’ve been looking for an explanation of how RAID 5 parity works with more than 3 disks for a long time. This is by far the best interpretation I’ve found.

    Thank you very much 🙂

  12. Viju Balan says:
    December 20, 2017 at 22:31

    Thank you Sir. I have been looking for this logic for quite some time. Explained well.

  13. Rakesh says:
    March 11, 2018 at 05:31

    Thank you very much for this article…
    Still, I have one doubt on my mind: how is data stored in a degraded RAID 5? Suppose 4 physical drives are part of one RAID 5 VD and drive 4 fails unexpectedly. How will data be written to the VD now? Will parity be calculated for the data, or will only the data be stored?

    Thank you in advance.

    1. Rickard Nobel says:
      March 12, 2018 at 13:50

      Hello Rakesh,

      I will assume that NEW data to a degraded RAID 5 volume will be stored as if the failed disk was actually in place. That is, sometimes the parity will be missing and sometimes the data will be missing. However, when the failed disk is replaced it will be rebuilt using the remaining information and afterwards the RAID5 set will be “complete”.

  14. Likhith says:
    March 12, 2018 at 16:14

    Thank you Sir, this article helped me a lot.
    I would like to know more about striping and mirroring in the initial levels of RAID.
    If you could illustrate with an example how binary data is striped and mirrored in RAID 0 and RAID 1, it would help me a lot.

    Looking forward to your new article on striping and mirroring.
    Thank You

    1. Rickard Nobel says:
      March 13, 2018 at 11:08

      Hello Likhith,

      thank you for your comment.

      As for RAID 0 and RAID 1, they are actually quite straight forward.

      In RAID 0 the data is just spread across the disks in the set, written only once and with no parity or other redundancy.
      So, if you want to store 11110000 00001111 it would (simplified) be:
      Disk 1: 11110000
      Disk 2: 00001111

      For RAID 1 it would just be an exact copy of the data, so if you wanted to store 10101010 it would be (simplified, very small stripe size):

      Disk 1: 10101010
      Disk 2: 10101010

  15. Likhith says:
    March 13, 2018 at 15:31

    Thank you for your reply Sir,

    But could you explain in more detail how exactly striping happens? What is the algorithm that divides the data and then stores it among the disks?

    Just as you explained how data is regenerated using parity in RAID 5, please explain how striping happens when a byte of data is fed into RAID 1.

    thank you

    1. Rickard Nobel says:
      March 13, 2018 at 15:33

      Hello Likhith,

      the actual distribution depends on something called “stripe size” which is the amount of data written to the same disk. This size is not a standard but will differ depending on the actual RAID implementation.

  16. Likhith says:
    March 13, 2018 at 16:02

    Thank You so much for sharing your knowledge Sir. it helped me a lot .

  17. Hari says:
    April 23, 2018 at 19:56

    This is by far the best explanation for raid 5 that I’ve come across. Does using a hot spare with a raid 5 array increase the tolerance to 2 disk failures considering the data of the 1st failed drive is already rebuilt to my hot spare and the 2nd failed drive will not result in any data loss ?

    1. Rickard Nobel says:
      May 8, 2018 at 14:49

      Hello Hari,

      no, having a spare disk could not be said to truly increase the fault tolerance to two disks. In a “true” 2-disk-fault-tolerance system, like RAID6, any two disks could break at the same second and the volume would still be available.

      This would not be possible with RAID 5 plus a spare disk, since the spare has to be fully rebuilt before another drive can fail. If, however, you have your RAID 5 plus spare and one disk fails, the spare is rebuilt and a second disk fails the next day, then the volume would still work. This is not guaranteed if the failures come close together in time.

  18. Anup Agarwal says:
    August 23, 2019 at 19:18

    So, can we say that the parity disk actually holds the data which is calculated from other disks during each block striping?


©2021 RICKARD NOBEL AB