Should you use RAID 1 or RAID 5? RAID 5 is more work and has some advantages for specific uses, but RAID 1 is easier and underused.
RAID 1 uses two disks for redundancy and extra speed. RAID 5 requires three or more disks. Your computer may limit you to a small number of disks, in which case RAID 1 is the better option. I will start with some RAID 1 examples then cover some RAID 5 examples.
This is the critical bit you have to test on your system. Whatever you choose to build, the build time may not be important because you can often start the array building the day before you need the computer. The critical bit is replacing a broken disk. How long will the replacement take? Can you use your computer while the array rebuilds?
Build your array. Load the array up with lots of data. Switch the computer off, replace a disk, then start the computer and rebuild the array. The rebuild might take 4 hours or 14 hours or 24 hours. When the rebuild is active, you might be able to use the computer or you might not. If you can use the computer, it might be so slow that it is effectively useless.
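If your test machine runs Linux software RAID (md), one way to time the rebuild is to read the progress line the kernel shows in /proc/mdstat. The sketch below assumes that setup. The function name parse_rebuild_progress and the sample text are made up for illustration, and other RAID systems report progress in other ways.

```python
import re

# Matches the progress line md shows during a rebuild, for example:
#   [==>....]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec
PROGRESS = re.compile(
    r"(recovery|resync)\s*=\s*([\d.]+)%.*?finish=([\d.]+)min\s+speed=(\d+)K/sec"
)

def parse_rebuild_progress(mdstat_text):
    """Return (action, percent_done, minutes_left, kib_per_sec), or None if idle."""
    match = PROGRESS.search(mdstat_text)
    if match is None:
        return None
    action, percent, minutes, speed = match.groups()
    return action, float(percent), float(minutes), int(speed)

# Illustrative sample text, not output captured from a real system.
SAMPLE = """md0 : active raid1 sdb1[1] sda1[0]
      292945152 blocks [2/1] [U_]
      [==>..................]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec
"""

if __name__ == "__main__":
    print(parse_rebuild_progress(SAMPLE))
```

On a real machine you would pass in the contents of /proc/mdstat instead of SAMPLE and log the result every few minutes while you try to use the computer.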
I often use the RAID 1 plus RAID 5 combination mentioned a few times on this page. Rebuilding the RAID 1 array is fast enough to not disrupt my work. Rebuilding the RAID 5 array is long and slow but there are many things I can do using only the RAID 1 array. Rebuilding the RAID 5 array stops me performing some types of work but not everything. I can read mail, work on the web, and be productive while the RAID 5 array rebuilds.
I have tried computers set up with one big RAID 5 array using software and hardware. Some computers end up not accessible until the rebuild is finished. Most of the accessible computers are too slow to be useful during the rebuild. Some very expensive hardware RAID cards let you rebuild and use the computer at the same time, but the money spent on the hardware card is far greater than the cost of adding some extra disks. The money you save by leaving out the dedicated hardware lets you replace the single RAID 5 array with a mixed RAID 1 and RAID 5 configuration and use extra fast SSDs for the RAID 1 array.
The best way to find out what will happen with your computer is to break and rebuild an array before you load the array with your critical data.
My netbook has one disk. You cannot use RAID on one disk. If you want backup, you have to use an incremental backup to continuously copy updated files to a backup server. You may need special software to synchronise databases to the backup server.
Most desktop computers and some notebook computers let you use two disks. Two disks of the same size and type are ideal for RAID 1. Everything is written twice, once to each disk. The overheads of writing twice are almost nothing on modern computers because there is very little processing involved. When you read from disk, your operating system should alternate the reads across the two disks, making the reads faster, up to twice as fast.
The available space on a RAID 1 array is the space on one disk. If you use two disks of 500 GigaBytes, the space in the array is 500 GB. RAID 1 is less space efficient than RAID 5. With the low cost of disks, RAID 5 makes little difference to the total cost of your computer system compared to RAID 1. The deciding factors are more likely to be the number of disks you can fit in the computer and how long you can wait to rebuild an array when you have to repair an array.
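The space arithmetic above can be written as a small sketch. The function name usable_capacity_gb is made up for this example, and it assumes every member of an array is limited to the size of the smallest disk.

```python
def usable_capacity_gb(level, disk_sizes_gb):
    """Usable space for one array; each member is limited to the smallest disk."""
    count = len(disk_sizes_gb)
    smallest = min(disk_sizes_gb)
    if level == 1:
        if count < 2:
            raise ValueError("RAID 1 needs at least two disks")
        return smallest                  # every disk holds the same copy
    if level == 5:
        if count < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return smallest * (count - 1)    # one disk's worth of space holds parity
    raise ValueError("only RAID 1 and RAID 5 are covered here")

print(usable_capacity_gb(1, [500, 500]))       # 500
print(usable_capacity_gb(5, [500, 500, 500]))  # 1000
```

Two disks in RAID 1 give you half of the raw space. Three disks in RAID 5 give you two thirds.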
Synchronising the two disks is very fast compared to the same action in RAID 5. There is very little overhead required to set up the array and very little overhead required to synchronise an array after replacing a broken disk. Everything in RAID 1 is as fast as, or faster than, a standard desktop style RAID 5 array. RAID 5 does not catch up until you have many disks in an array and a computer that can handle fast data transfer rates.
Synchronising two common 500 GB disks in a RAID 1 array takes about an hour to copy from one disk to the other. I have four of those same disks synchronising in a RAID 5 array and after six hours, the synchronising is less than half done. The final RAID 5 time will be 15 hours.
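To put those times in perspective, here is a rough sketch that turns them into an average data rate per disk. It assumes decimal units (1 GB = 1000 MB) and that the whole 500 GB of each disk is processed. The function name is made up for the example, and the hours are the figures quoted above, not new measurements.

```python
def rebuild_rate_mb_per_s(disk_gb, hours):
    """Average rate per disk needed to cover disk_gb in the given number of hours."""
    return disk_gb * 1000 / (hours * 3600)

raid1_rate = rebuild_rate_mb_per_s(500, 1)    # one hour for the RAID 1 pair
raid5_rate = rebuild_rate_mb_per_s(500, 15)   # fifteen hours for the four disk RAID 5

print(f"RAID 1: about {raid1_rate:.0f} MB/s per disk")
print(f"RAID 5: about {raid5_rate:.0f} MB/s per disk")
```

The RAID 1 figure is close to what one magnetic disk can copy in a straight line. The RAID 5 figure is fifteen times lower on the same disks.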
If you have three disks the same size and you want RAID, RAID 5 is the simple answer. On one computer, I have three disks in the form of one SSD for the operating system and two large capacity magnetic disks in a RAID 1 configuration. If I was rebuilding that computer, I would add a second SSD and use two SSDs in a RAID 1 array for the operating system so that everything is on RAID.
For three disks in RAID 5, you effectively have the equivalent of two disks storing data and one disk storing the redundant parity information used to recreate a missing disk. Back in the early days of RAID, the parity information was on a single disk and the result was named RAID 4. RAID 5 spreads both the data and parity over all the disks to even out the workload.
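RAID 5 parity is commonly calculated with XOR. The sketch below shows the idea for one stripe of a three disk array, with two data blocks and one parity block. The block contents and names are made up for the example, and a real array rotates the parity block from disk to disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data_a = b"RAID"                         # block stored on the first disk
data_b = b"five"                         # block stored on the second disk
parity = xor_blocks([data_a, data_b])    # block stored on the third disk

# Pretend the second disk broke: rebuild its block from the two survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b)  # b'five'
```

The same calculation recovers any one missing block, which is why the array survives one broken disk but not two.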
When you write to a RAID 5 array, your computer processor will have to use some time calculating the parity information but the time is trivial for a modern computer processor. If you see an article telling you the RAID 5 overhead is massive, check when the article was written. Articles of that type in the Linux and Unix worlds are often five or ten years old and written based on an opinion formed several years before the article was written.
The thing that really kills write speed is the stripe size. RAID 5 writes in stripes. If you want to write an 8 KiloByte change into a 64 KB stripe, the computer first has to read the 64 KB stripe then write back the updated stripe. The 8 KB change might cover the end of one stripe and the start of the next. In that case, the computer has to read two stripes of 64 KB then write back two stripes of 64 KB. For normal use, you save time by using small stripes or switching to RAID 1. Large stripes are good for video files.
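The cost described above can be counted with a short sketch. It follows the simplified model in the text, where every stripe touched by a write is read in full and written back in full. Real RAID software may read less than that, and the function names are made up for the example.

```python
KB = 1024

def stripes_touched(offset, length, stripe_size):
    """Number of stripes covered by a write of length bytes starting at offset."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return last - first + 1

def bytes_moved(offset, length, stripe_size):
    """Bytes read plus bytes written back under the simplified model."""
    return 2 * stripes_touched(offset, length, stripe_size) * stripe_size

print(bytes_moved(0, 8 * KB, 64 * KB) // KB)        # 128: one 64 KB stripe read and written
print(bytes_moved(60 * KB, 8 * KB, 64 * KB) // KB)  # 256: the change straddles two stripes
print(bytes_moved(0, 8 * KB, 8 * KB) // KB)         # 16: an 8 KB stripe fits the change
```

An 8 KB change can move 256 KB of data with 64 KB stripes but only 16 KB with stripes that match the change.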
Four disks is the worst decision. Assume you are using 500 GB disks. A single RAID 5 array will give you 1.5 TeraBytes of space and require 15 hours to rebuild the array (based on my current backup server). Using those same disks as two arrays of RAID 1 will give you only 1 TB of storage space and require a rebuild of only one hour for a broken disk. If you use the wrong stripe size on the RAID 5 array, the performance will be worse than RAID 1. With the right stripe size, the RAID 5 array will average a similar speed to the RAID 1 configuration.
The RAID 1 configuration has one disadvantage. You will have two separate arrays of 500 GB. That might be fine for some uses. For other uses, you might want to join the two arrays as one bigger array using something called RAID 0. RAID 0 is not RAID. RAID 0 has no redundancy so can never be RAID. RAID 0 is a borrowed name for joining disks with no redundancy, usually by striping, a close relative of the disk partition concatenation that was in some operating systems long before the first RAID. You can use what is now called RAID 0 to join two of our example 500 GB disks to form one disk volume of one TB. You also use RAID 0 to join the other two disks into one disk volume of one TB. You then use RAID 1 to mirror the two one TB volumes as a single one TB RAID 1 array. Because you combine RAID 1 and RAID 0, the result is often called RAID 10, although strictly a mirror of two RAID 0 sets is RAID 0+1 and RAID 10 is a stripe across mirrored pairs.
As I mentioned before, RAID 0 by itself has no redundancy. RAID 10 offers redundancy through the RAID 1 layer. How long will recovery take for a single disk failure? First you break the RAID 1 mirroring to take you back to one working and one broken RAID 0 array. You delete the broken RAID 0 array, replace the broken disk, then recreate the RAID 0 array. This part of the work takes only a few minutes, or as long as you need to replace the disk. You then recreate the RAID 1 array by synchronising from the working RAID 0 component to the repaired RAID 0 component. Our example disks require an hour to synchronise one 500 GB disk and can be expected to take two hours to synchronise two disks.
RAID 0 is trivial by itself and not a way to provide redundancy. RAID 0 has to be used with something else and RAID 1 is the obvious partner. RAID 0 is sometimes used to match up different size disks. If you have two disks of 500 GB and two of 750 GB, you could use RAID 0 to create a 1.25 TB array from a 500 GB disk and a 750 GB disk. You could repeat the RAID 0 join up for the other two disks then use RAID 1 to create a 1.25 TB array. What would you get from using the same disks with RAID 5? You can use only 500 GB on the 750 GB disks. The result will be three times 500 GB or 1.5 TB. RAID 5 will be slower to create, slower to repair, and, in most situations, slower to operate, plus RAID 5 produces very little extra space because of all that wastage.
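The mixed size comparison above works out as follows. This is a sketch of the arithmetic only. It assumes RAID 0 can use the full size of both members, as in the text, and that RAID 5 cuts every disk down to the smallest one.

```python
disks_gb = [500, 500, 750, 750]

# Option 1: two RAID 0 joins of 500 + 750, mirrored with RAID 1.
pair_a = 500 + 750
pair_b = 500 + 750
mirrored_gb = min(pair_a, pair_b)

# Option 2: one RAID 5 array over all four disks.
smallest = min(disks_gb)
raid5_gb = smallest * (len(disks_gb) - 1)
unused_gb = sum(size - smallest for size in disks_gb)

print(mirrored_gb)  # 1250
print(raid5_gb)     # 1500
print(unused_gb)    # 500 GB left idle on the two 750 GB disks
```

RAID 5 gains 250 GB over the mirrored option while leaving 500 GB of the larger disks unused.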
Do you use five disks in a RAID 5 array or two disks in RAID 1 plus the other three in RAID 5? Five disks in a RAID 5 array, using my example 500 GB disks in my backup server, would produce a 20 hour array creation step and possibly a 20 hour array recreation step after a disk failure. The 20 hour recreation time is not workable. If I did use five disks of 500 GB in a RAID 5 array, the result would be a two TB array.
The two disk RAID 1 plus three disk RAID 5 option would require 1 hour to recreate if the break is in the RAID 1 array and 10 hours to rebuild if the break is in the RAID 5 array. I could live with that some days but not others. This configuration would give me 1.5 TB of storage in total.
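The two five disk options can be laid side by side in a short sketch. The rebuild hours are the figures quoted above for 500 GB disks on my backup server, not general rules, and the names in the code are made up for the example.

```python
DISK_GB = 500
RAID1_REBUILD_HOURS = 1
RAID5_REBUILD_HOURS = {3: 10, 5: 20}  # keyed by the number of disks in the array

def describe(layout):
    """layout is a list of (raid_level, disk_count) pairs, one pair per array.

    Returns (total usable GB, longest single rebuild in hours).
    """
    capacity = 0
    worst_rebuild = 0
    for level, count in layout:
        if level == 1:
            capacity += DISK_GB
            hours = RAID1_REBUILD_HOURS
        else:  # RAID 5
            capacity += DISK_GB * (count - 1)
            hours = RAID5_REBUILD_HOURS[count]
        worst_rebuild = max(worst_rebuild, hours)
    return capacity, worst_rebuild

one_big_array = describe([(5, 5)])
mixed_arrays = describe([(1, 2), (5, 3)])
print(one_big_array)  # (2000, 20)
print(mixed_arrays)   # (1500, 10)
```

The mixed layout gives up 500 GB and halves the worst rebuild. A break in the RAID 1 pair costs only an hour.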
Would I use five disks? I had one computer with a motherboard containing six SATA connections and one connection was used for a DVD drive, leaving five for disks. If I was to build the same computer today, I would consider using two SSD disks in RAID 1 for the system plus a three disk RAID 5 for data.
There is another advantage to using five disks in two arrays. If you edit video files or large camera images, you can keep the large files on the RAID 5 array and set the stripe size to something large for greater efficiency. All the small and mixed size files will be on the RAID 1 array where there is no striping to confuse performance.
Consider another option to offset the long rebuild times for RAID 5. The one reason you rebuild a RAID 5 array immediately is to replace the single broken disk because you do not have redundancy until you repair the array. RAID 6 is RAID 5 with extra redundancy and could keep you going longer. If a disk broke in an array that was going to cost me a day, I could not afford to replace it during the working week but could replace it over the weekend. RAID 6 could give me that spare time to wait until replacement is convenient.
Use our example of five disks of 500 GB. RAID 6 keeps two independent sets of parity spread across all five disks, so the usable space is the same 1.5 TB you get from a four disk RAID 5 array. You can think of the fifth disk as a spare that is already in use. If one disk broke, the remaining four disks would still have the redundancy of a four disk RAID 5 array until I can replace the disk and rebuild.
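Here is a sketch of what the extra parity costs and buys, assuming equal size disks. The function name parity_array is made up for the example.

```python
PARITY_DISKS = {5: 1, 6: 2}  # disks' worth of space used for parity, by RAID level

def parity_array(level, disk_count, disk_gb):
    """Return (usable_gb, broken_disks_survived) for a RAID 5 or RAID 6 array."""
    parity = PARITY_DISKS[level]
    if disk_count < parity + 2:
        raise ValueError("not enough disks for this RAID level")
    return disk_gb * (disk_count - parity), parity

print(parity_array(5, 5, 500))  # (2000, 1)
print(parity_array(6, 5, 500))  # (1500, 2)
print(parity_array(5, 4, 500))  # (1500, 1)
```

Five disks in RAID 6 hold the same data as four disks in RAID 5, and the array keeps running after a second disk breaks.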
Look at adding an extra disk to form RAID 6 when you have to use RAID 5, instead of RAID 1, and want to move the rebuild time out of your peak activity time.
Six disks or more
Lots of desktop computers can hold seven or more disks. One big RAID 5 array gives you two problems: rebuild time and stripe size. You can solve the rebuild time by using RAID 6 and pushing the rebuild to an off peak time. Stripe size is still a problem. I always start with a system disk built on RAID 1, possibly on SSD, then consider RAID 5 or RAID 6 for the rest of the disks. The stripe size for the RAID 5 array can then be adjusted to fit the most common type of file stored on the array.
Six disks could also be two RAID 5 arrays, with a small stripe size for the system array and a larger stripe size for the data array. Given the size of modern disks, this configuration rarely provides the right balance. You might use two RAID 5 arrays when you have many more disks.
You might also consider a two disk RAID 1 array for the system disk, a three disk RAID 5 array with a small stripe size for user and application data, then the rest of the disks in a RAID 5 array with a large stripe size for your large media files. A long time ago when disks were small, I looked at a photo editing computer with multiple RAID 5 arrays. I started with a deskside tower case that could take 11 disks. The configuration was to be the system on a two disk RAID 1 array, the user application data on a four disk RAID 6 array with a small stripe size, then the large image files on a five disk RAID 6 array with a large stripe size. The result would be too much noise and slow complicated disk rebuilds.
While I was deciding on the final specifications for the 11 disk computer, Seagate started selling a disk that was twice as big, significantly faster, used less power, and produced less noise. I ended up building a small quiet six disk system using two small existing disks for the system RAID 1 array, two of the new large disks for a RAID 1 user array, and two of the large new disks for a RAID 1 image file array. Yes, the image file array was a bit small but by the time I started to fill it up, Seagate released another new disk that was double the size of my existing large disks. Replacing the RAID 1 array was easier and faster than replacing a RAID 5 array.
Super size arrays
The largest useful array I have looked at was a 32 disk RAID 6 array. RAID 6 sets aside the space of two disks for parity. Yes, two. Two disks can break before you lose data. The combined capacity of the disks is equivalent to 30 disks instead of the 31 disks you would get from plain RAID 5. The array owner had experience of two disks breaking on the same day with the second disk breaking during the rebuild from the first broken disk.
The array owner used high quality disks from a reliable brand guaranteed to last five years. The array was in a protected environment with no voltage or temperature fluctuations. The array owner owned many similar large arrays totalling hundreds of disks. With that many disks, the odds of two breaking on the same day were high. They just happened to be in the same array.