In this article we will look in some detail at how the **RAID 5 parity** is created and how it is possible to actually “read” from a destroyed disk in a RAID 5 set.

There are many sources on the web about the general principle of RAID 5, so we will not cover that part here. In short, and as you might know, RAID level 5 works with any number of disks equal to or greater than three, and records parity information across the set to be able to recover from a disk failure (striped blocks with distributed parity). We can for example combine eight physical disks into a RAID 5 set while consuming in total only the capacity of one disk for parity information. If *any* single drive breaks down we would **still have full access to the data that was on the destroyed disk**.

To understand how this is possible we have to look at the smallest unit, the binary bit, which can be 1 or 0. When doing mathematical calculations in binary we have several so-called Boolean algebra operations, for example the **AND** operation and the **OR** operation.

One of these low level logical operations is used heavily in RAID 5: the **XOR** (“exclusive or”). XOR takes two binary digits and produces a true result if exactly one digit is true (i.e. the other digit must be false).

| Value A | Value B | XOR result |
|---------|---------|------------|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

This means that for example **1 XOR 0 = 1**, and **1 XOR 1 = 0**. Exactly one of the binary digits may be 1 for the result to be “true”, that is, 1.
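The same truth table can be reproduced with, for example, Python’s bitwise XOR operator:

```python
# Python's ^ operator is bitwise XOR, the very operation RAID 5 uses.
for a in (0, 1):
    for b in (0, 1):
        print(f"{a} XOR {b} = {a ^ b}")
```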

Let us now see how the parity calculations are done in a RAID 5 set using XOR. Assume we have a small RAID 5 set of four disks and some data is written to it. For simplicity we look at only half a byte (4 bits), but the principle holds regardless of the stripe size or the number of disks.

On the first three disks we have the binary information 1010, 1100 and 0011, here representing some data, and we now have to calculate the parity information for the fourth disk.

If we look at the first “column” of the disks, to the left, we have 1, 1 and 0. If we use XOR to calculate the result that would be:

**1 XOR 1 XOR 0 = Parity bit**

This could be written as: (1 XOR 1) XOR 0 = Parity bit

This means first **1 XOR 1 = 0** for the first two disks, and then the result of that, the zero, is XORed against the bit on the third disk. That is, the intermediate result **0** against the last disk’s bit, also **0**, gives **0 XOR 0 = 0**, so the final parity bit is **0**.
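As a quick check, the chained calculation for this column can be written out step by step in Python:

```python
# First column of the three data disks: bits 1, 1 and 0.
step1 = 1 ^ 1       # the first two disks: 1 XOR 1 = 0
parity = step1 ^ 0  # that result against the third disk: 0 XOR 0 = 0
print(parity)       # prints 0
```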

For the next “column”, to the right above, we have 0, 1 and 0. First **0 XOR 1 = 1**, and then this result against the third disk: **1 XOR 0 = 1**. The parity bit here will be **1**.

For the third column we would have:

**1 XOR 0 XOR 1 = Parity**

Broken down: **1 XOR 0 = 1** and then **1 XOR 1 = 0**

And finally the fourth column:

**0 XOR 0 XOR 1 = 1**

Across all four columns this ends up with the parity sum of 0101.
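The whole stripe can be verified in one go. Here is a minimal Python sketch, XORing the three data values from the example (1010, 1100 and 0011):

```python
# The three data disks from the example, as 4-bit values.
disks = [0b1010, 0b1100, 0b0011]

parity = 0
for d in disks:
    parity ^= d          # fold each disk's bits into the running parity

print(f"{parity:04b}")   # prints 0101, the parity stored on the fourth disk
```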

If any of disks 1, 2 or 3 were to break, the parity information on disk 4 could be used to recreate the missing data. Let us look at how this is done. If we assume that disk 2 unexpectedly goes down, we have lost all read and write access to the physical disk 2; however, with the help of the already recorded parity we can calculate the information which is missing.

The primary feature of a RAID 5 disk set is the ability to “access” the data on a missing disk. This is done by running the exact same XOR operation over the remaining disks and the parity information. Let us look at the first column again: **1 XOR 0 = 1** (for disk 1 and disk 3), and then **1 XOR 0** (the parity) **= 1**. This means that there **must** have been a binary digit of 1 on the missing disk. If we do the same operation on the other columns we end up with 1100, which is exactly the data that was on the failed drive.
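Using the same example values, the rebuild of the failed disk 2 can be sketched in Python like this:

```python
# Surviving data disks and the parity from the example above.
disk1, disk3, parity = 0b1010, 0b0011, 0b0101

# XORing everything that is left recreates the failed disk's data.
lost_data = disk1 ^ disk3 ^ parity
print(f"{lost_data:04b}")  # prints 1100, the data of the failed disk 2
```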

The XOR operation itself is extremely quick and easily handled by the CPU or RAID controller, but the big downside is that we have to read from ALL other disks to recreate the data on the missing one. With, for example, eight disks in the set and one broken, a single read I/O against the missing disk will cause seven more disk I/Os to calculate the lost data on the fly.
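A small sketch of how such a degraded read fans out into one I/O per surviving disk (the function name is illustrative, not from any real controller API):

```python
def read_degraded(blocks_from_surviving_disks):
    """Recreate one block of a failed disk.

    Every surviving disk (data and parity alike) must be read once,
    so the list passed in represents N-1 extra read I/Os.
    """
    block = 0
    for b in blocks_from_surviving_disks:
        block ^= b          # same XOR fold as during normal parity calculation
    return block

# Eight-disk set with one drive failed: seven reads to serve one request.
surviving = [0b1010, 0b1100, 0b0011, 0b0110, 0b1001, 0b1111, 0b0101]
print(len(surviving), "extra read I/Os for a single block")
```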

The XOR operation works perfectly, mathematically, with **one disk** missing, but the moment a second disk is lost we no longer have enough information to do the calculations. While it is possible to keep using the RAID 5 set with one disk missing for some time, with degraded performance, it is naturally wise to replace the damaged disk and begin the full re-creation as soon as possible (a hot spare is quite handy here).

See also this blog post on the **RAID 5 write penalty**.

Would love to see a “How RAID6 actually works” from you. Or do you just add another layer with the exactly same parity?

I have plans to add blog posts about details in both RAID 10 and RAID 6, hope to be able to get time to write them down soon. Thanks for your comment!

Thank you very much for this article… Helped me to explain how RAID 5 works in my presentation.

Thanks brother

I am glad to hear that Jules, thanks for your comment.

Just a small question to get it right:

Isn’t it so that in RAID 5 the calculated parities are NOT on one disk but rather spread across all disks?

Jules, that is correct, and the main difference between RAID4 and RAID5.

The parities are, however, stored together in a configurable-sized “stripe”, which could be for example 128 KB. That means that a certain amount of parity bits will be stored on the same physical disk, then the next disk will hold the parity, and so on.

In the examples above we see a very few binary bits in detail and how they are calculated; they are, however, assumed to be located together in one “stripe”. Since that is not explicitly pointed out, I understand your question. 🙂

Regards, Rickard

Thanks a lot for the explanation.

Very nice explanation. Thank you for this.

Does this mean then.. that if you removed an SAS drive from a RAID array e.g. Dot Hill AssuredSAN, you would not be able to retrieve data from an individual drive without having it connected with all the other original drives?

I am in this position, and wish to sell the Seagate SAS drives individually but do not know how to wipe them, or even if I need to, before I sell.

Thanks.

No, not quite. RAID 5 doesn’t guarantee that file data will be split among drives. It records data based on cluster size. Some data — especially small chunks of information, such as credit card numbers, and some personally identifiable information — may reside on one drive. There’s more than enough complete information on a single drive from a RAID 5 setup to make it a security risk if sensitive information was on that array.

Upshot: if you’re disposing of a single drive from a RAID-5 setup that contained sensitive data, wipe or physically destroy the drive.

simple and clear explanation..

just to be sure rickard, if i used the raid 5 with just 3 disks does it means that my 1st disk saves 0 2nd disk saves1 and my 3rd disk saves both 0 and 1?

Hello Jien, no, not really. If we assume that the third disk holds the parity for this particular data then it will store a “1”.

Hello Dear Rickard Nobel,

I have read your article about how raid 5 works and now I have few questions. (First, I have quoted some sentences from your article and then I’ve asked my questions)

>> we now have to calculate the parity information for the fourth disk.

– Does only the fourth disk include the parity bits? I read an article somewhere and it’s said that every disks in the RAID 5 set includes parity bits not just fourth disk. I’m really confused about that.

>> If any of Disk number 1, 2 or 3 would break the parity information on Disk 4 could be used to recreate the missing data.

– What if Disk 4 dies? If we lose the fourth disk, will everything be gone, because we’ve lost the parity information?!

Thanks.

Reza: I used to have a bit of a problem understanding this as well. The “n”th drive is not a parity drive – the data is stored on all the disks. The parity data is stored on all the disks.

If you have 4 1TB drives, you can have a RAID 5 array of 3TB.

I’ll give eight bits for three drives, and the parity. In my example, we want odd parity – an odd number of 1s in each column.

10010110 – Disk 1

11000011 – Disk 2

01011010 – Disk 3

11110000 – Disk 4 [parity]

(I did it deliberately. 🙂 )

Even though I labeled them Disk 1 through 4, the order doesn’t matter – the “Disk 4” is the parity stripe, and could be on any of the drives.

The idea here is to have data on N-1 disks, and the last disk is a parity stripe from which, with any N-2 disks, you can recreate a missing disk’s data.

If Disk 4 dies, we can recalculate all the missing data from the other three.

If it died and we replaced it, the system will rebuild the data – if it was a data stripe, the parity stripe lets us “rebuild” the missing data; if it was a parity stripe, the system recalculates the parity.

RAID 4 has a dedicated parity drive.

RAID 5 has the parity spread out so that each drive is being used for parity AND data. (This helps in a reduced system when a drive fails, as only 1/N of the data needs to be rebuilt from parity instead of all the data.)

I hope that clears it up a bit for you (a year and then some later) or for others who also wanted this answered.


You are an actual legend Rickard.