If you think about a RAID mirror (RAID 1), two theoretical ideas come to mind:
- It should be possible to speed up reads by reading from both (or 3, 4, etc.) drives simultaneously. In theory, for large files the speed-up could be similar to RAID 0.
- If data safety is more important than speed, the RAID controller should read from both drives, compare the data, and throw an error if they do not match.
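To make the two ideas concrete, here is a minimal sketch in Python. It is a toy model only: the mirror copies are plain byte strings, and the names `striped_read`, `verified_read` and `MirrorMismatch` are made up for illustration. It is not how any real controller is implemented.

```python
class MirrorMismatch(Exception):
    """Raised when the mirror copies disagree (hypothetical error type)."""


def striped_read(mirrors, offset, length, chunk=4):
    """Idea 1: read alternating chunks from each copy, RAID 0 style."""
    out = bytearray()
    pos = offset
    end = offset + length
    i = 0
    while pos < end:
        n = min(chunk, end - pos)
        source = mirrors[i % len(mirrors)]  # rotate through the copies
        out += source[pos:pos + n]
        pos += n
        i += 1
    return bytes(out)


def verified_read(mirrors, offset, length):
    """Idea 2: read the range from every copy and fail if any copy differs."""
    first = mirrors[0][offset:offset + length]
    for idx, copy in enumerate(mirrors[1:], start=1):
        if copy[offset:offset + length] != first:
            raise MirrorMismatch(f"copy {idx} differs from copy 0 at offset {offset}")
    return first
```

In a real array the chunks in `striped_read` would be fetched in parallel, which is where the speed-up would come from. `verified_read` costs a full read on every drive, so it trades speed for safety.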
About the first idea, that RAID 1 can speed up reads: there are numerous references on the web, in forums, and even in Wikipedia saying that some RAID controllers support this feature. I have tested Intel Matrix Storage Technology (the Intel ICH8R/ICH10R SATA RAID controller), a Dell PERC, and an Intel Server RAID card (I do not remember the model). None of them showed any performance increase compared to a single drive. I do not know about high-end RAID cards. Perhaps I should look there.
About the second: once I figured out that a RAID mirror does not improve performance, I assumed that RAID is designed for data safety, so it must be reading from both drives simultaneously and comparing the data from both.
From thoughts to reality.
Today, while comparing the data on two drives that came from one mirror, I found that the data is not the same. There are differences between the two drives.
At first I panicked. A drive must be faulty, and the data must be corrupted. I compared more and more data, and I started to see a pattern. I have some programming skills, and I know that when you create a large file without writing any data to it, the OS does this almost instantly. This is by design: the OS reserves space for the file on the HDD but does not write anything yet. Another observation is that when you create a RAID 1 in Intel Matrix RAID, this also happens instantly, so there is no comparing or formatting going on. Combine the two: a RAID is created from two drives that already hold some random data, and then a file is created without writing data to it. Even though it is a RAID mirror, the "unused" parts of such files do not match between the drives.
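The kind of comparison described above can be sketched roughly like this in Python. This is an illustration and not the tool I actually used: `differing_blocks` is a made-up helper, and the 4096-byte block size is an arbitrary assumption.

```python
import io


def differing_blocks(img_a, img_b, block_size=4096):
    """Yield the index of every block that differs between two binary streams."""
    index = 0
    while True:
        a = img_a.read(block_size)
        b = img_b.read(block_size)
        if not a and not b:
            break  # both streams are exhausted
        if a != b:
            yield index
        index += 1


# Small in-memory demo: the middle 8-byte block differs.
left = io.BytesIO(b"A" * 8 + b"B" * 8 + b"C" * 8)
right = io.BytesIO(b"A" * 8 + b"Z" * 8 + b"C" * 8)
print(list(differing_blocks(left, right, block_size=8)))  # [1]
```

For real drives you would pass two files opened in binary mode (for example, images taken from each disk) instead of the `io.BytesIO` objects. Mapping the reported block numbers back to files is what reveals the pattern: the mismatches fall into areas that were allocated but never written.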
When I came to this conclusion, I did a Google search, and a white paper from Intel about Intel Matrix Storage Technology confirmed that a request is sent to the disk that is closest to the location of the requested data or that is the least busy. In other words, a read is served by one drive only, and the two copies are not compared.
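A rough sketch of such a policy in Python is shown below. Everything in it is an assumption for illustration: the `Disk` fields, the `pick_disk` name, and in particular the ordering (least busy first, head distance as tie-breaker). The white paper does not spell out how the two criteria are combined.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    head_pos: int     # last block the head visited (hypothetical bookkeeping)
    queue_depth: int  # number of outstanding requests (hypothetical)


def pick_disk(disks, target_block):
    """Pick one mirror member to serve a read.

    Assumed policy: prefer the least busy disk, and among equally busy
    disks prefer the one whose head is closest to the target block.
    """
    return min(disks, key=lambda d: (d.queue_depth, abs(d.head_pos - target_block)))
```

Whatever the exact rule is, the important point is that only one disk is chosen per read, so a difference between the copies goes unnoticed during normal operation.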
So the data on a RAID 1 can differ between the two drives, but that does not mean the data is corrupted.
Intel Matrix Storage Console even has a feature to verify a volume for errors / differences. Right-click on your volume and choose Verify Volume Data. Here are screenshots: