Tag Archives: RAID

Intel Matrix Storage RAID 1 (mirror) – data does not match

If you think about a RAID mirror (RAID 1), you can arrive at two theoretical conclusions:

  1. It should be possible to speed up reads by reading from both (or 3, 4, etc.) drives simultaneously. In theory, for large files the speed-up can be similar to RAID 0.
  2. If data safety is more important than speed, then the RAID controller should read from both drives, compare the data, and throw an error if they do not match.

About the first one, that RAID 1 can speed up reads: there are numerous references on the web, in forums and even in Wikipedia saying that some RAID controllers support this feature. I have tested Intel Matrix Storage Technology (the Intel ICH8R/ICH10R SATA RAID controller), Dell PERC, and an Intel Server RAID card (I do not remember the model). None of them showed any performance increase compared to a single drive. I do not know about high-end RAID cards. Perhaps I should look there.

About the second: when I figured out that performance does not increase for a RAID mirror, I thought that, since RAID is designed for data safety, the controller must be reading from both drives simultaneously and comparing the data.

From thoughts to reality.

Today, when comparing the data on two drives that came from one mirror, I found that the data is not the same. The comparison shows differences between the two drives.
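A comparison like this can be sketched in a few lines of Python. This is only an illustration, assuming each drive has first been dumped to a raw image file; the function name `differing_blocks` and the block size are made up for the example and are not part of any Intel or WD tool.

```python
def differing_blocks(path_a, path_b, block_size=4096):
    """Return the indices of fixed-size blocks that differ between two files.

    Assumption: path_a and path_b are raw images of the two mirror members.
    """
    mismatches = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both images are exhausted
            if block_a != block_b:
                mismatches.append(index)
            index += 1
    return mismatches
```

Looking at which block indices come back (scattered single blocks versus long runs inside particular files) is what makes a pattern visible.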

At first I panicked: the drive must be faulty and the data corrupted. I compared more and more data, and I started to see a pattern. I have some programming skills, and I know that when you create a large file without writing any data to it, the OS does this almost instantly. This is by design: the OS reserves space for the file on the HDD but does not write anything yet. Another observation is that when you create a RAID 1 volume in Intel Matrix RAID, this also happens instantly, so no comparing or formatting is going on. Combine the two (a RAID created from two drives that already hold some random data, and a file created without writing data to it), and even on a RAID mirror the “unused” parts of such files do not match between the drives.
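The reasoning can be reproduced with a toy model. The sketch below is not how the Intel controller is implemented; it only assumes a mirror that does no initial synchronization and that duplicates every write. The class name `UninitializedMirror`, the block size and the fill patterns are all invented for the example.

```python
BLOCK = 512  # illustrative block size in bytes


class UninitializedMirror:
    """Toy RAID 1 that is created instantly, with no initial sync of its members."""

    def __init__(self, drive_a, drive_b):
        self.drives = [drive_a, drive_b]

    def write_block(self, index, data):
        # Every write goes to both members, so written blocks always match.
        assert len(data) == BLOCK
        for drive in self.drives:
            drive[index * BLOCK:(index + 1) * BLOCK] = data

    def mismatched_blocks(self):
        a, b = self.drives
        return [i for i in range(len(a) // BLOCK)
                if a[i * BLOCK:(i + 1) * BLOCK] != b[i * BLOCK:(i + 1) * BLOCK]]


def demo():
    # Two 8-block drives holding different leftover data from earlier use.
    drive_a = bytearray(b"\xAA" * BLOCK * 8)
    drive_b = bytearray(b"\x55" * BLOCK * 8)
    mirror = UninitializedMirror(drive_a, drive_b)
    # A file reserves blocks 0..5, but the application only writes blocks 0..2.
    for i in range(3):
        mirror.write_block(i, bytes([i]) * BLOCK)
    return mirror.mismatched_blocks()
```

In this model `demo()` reports blocks 3 to 7 as mismatched: the reserved but never written blocks 3 to 5 of the file, plus the free blocks 6 and 7. Everything that was actually written is identical on both drives.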

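When I came to this conclusion, I did a Google search, and a white paper from Intel about Intel Matrix Storage Technology confirmed that a read request is sent to the disk that is closest to the location of the requested data or that is the least busy. In other words, each read is served by one drive only, and the two copies are not compared during normal reads.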
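A read policy of this kind could look roughly like the following sketch. The white paper only names the two criteria; how they are combined is not stated, so the ordering here (least busy first, then nearest position) is an assumption, and `Member`, `head_lba`, `queue_depth` and `pick_member` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    head_lba: int     # hypothetical: last block address this drive serviced
    queue_depth: int  # hypothetical: number of outstanding requests


def pick_member(members, target_lba):
    """Choose ONE mirror member to serve a read.

    Assumed ordering: the least busy drive wins; ties are broken by the
    shortest distance between the drive's last position and the target.
    """
    return min(members,
               key=lambda m: (m.queue_depth, abs(m.head_lba - target_lba)))
```

Whatever the real ordering is, the important point is that only one member is chosen per read, so a mismatch between the copies goes unnoticed unless a separate verify pass is run.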

So the data on a RAID 1 volume can differ between the two drives, but that does not mean that the data is corrupted.

Intel Matrix Storage Console even has a feature to verify a volume for errors / differences. Right-click on your volume and choose Verify Volume Data. Here are screenshots:

Verify Volume Data

Verifying

Intel RAID Volume Inaccessible and Failed – RAID1 using WD VelociRaptor

This is a story about two WD (Western Digital) VelociRaptor 600 GB SATA hard drives (WD6000HLHX) connected in RAID1 (mirror) on an EVGA motherboard using Intel ICH10R wRAID5. The RAID1 and / or an HDD failed, died, and then (I do not have a real explanation) revived again after a reformat and a RAID rebuild.

It all began with random reboots, and Intel Matrix Storage Console reported that one of the drives was in a failed state. After marking the drive as normal, the RAID rebuilt itself, and everything worked for a couple of days. Then came the same reboots, the same rebuilds, and the feeling that some piece of hardware was dying.

After one of the reboots, the computer didn’t start up. It hung at the BIOS screen, saying that there were no bootable drives and that the RAID1 had failed. This time it showed the whole RAID volume as failed instead of just one drive.

Failed RAID Volume Detected

Intel(R) Matrix Storage Manager option ROM v8.0.0.1038 ICH10R wRAID5. Copyright(C) 2003-08 Intel Corporation. All Rights Reserved.
[MAIN MENU]
Failed RAID volume detected. Recover volume? (Y/N):

After I answered Yes to this recovery question, the rebuild process started. But the rebuild never finished; it hung in the middle with the following errors:

Volume Inaccessible

Volume Inaccessible
The data in a volume is no longer accessible because of failed hard drives. Click here to identify the failed drives.

Of course you cannot click on this balloon error, because the system HDD / RAID has failed, and it is impossible to start the Intel program from that drive.

Intel RAID Failed Port 0

After this Volume Inaccessible error, it is impossible to continue with anything, because Windows does not recognize the HDD / RAID volume any more. Something similar happens when you hot-unplug an HDD.

Then we downloaded WD Data Lifeguard Diagnostic for Windows to test for HDD errors. The tool can be downloaded here.

It is impossible to test an HDD while it is in RAID mode, so we removed the drives from the RAID. Now Lifeguard Diagnostic showed errors for one of the disks, the one on port 0.

DLGDIAG Status Code 07, Failure Checkpoint 97

Western Digital Data LifeGuard Diagnostics - DLGDIAG for Windows
[DLGDIAG - QUICK TEST]
[DLGDIAG for Windows]
Quick Test on drive 1 did not complete!
Status code = 07 (Failed read test element), Failure Checkpoint = 97
(Unknown Test)
SMART self-test did not complete on drive 1!

After digging through some WD PDF manuals, I found that these codes mean the HDD is unrecoverable and needs to be sent to WD for replacement. So I decided to format the failed drive (privacy concerns) and send it to WD. However, to my surprise, formatting went without any problems; moreover, running WD Data Lifeguard Diagnostic again showed that the error was gone. More extensive testing, however, produced the warning ‘Failed to update disk property’. I didn’t find any reference to this warning on the web, but I suspect it is caused by the fact that the HDD is not formatted.

Failed to update disk property!

Western Digital Data LifeGuard Diagnostics - DLGDIAG for Windows
[DLGDIAG - WRITE ZEROS]
...
[DLGDIAG for Windows]
Failed to update disk property!

It has now been running for a couple of days or more without any problems. The previously failed HDD is now on Port 3 instead of Port 0.

Here is a photo of the drives before they went into one of our Antec Twelve Hundred cases. Sorry for the low-quality image.

Two WD VelociRaptor SATA HDD 600GB

WD VelociRaptor Enterprise Hard Drives (600 GB, SATA 6 Gb/s, 32 MB Cache, 10,000 RPM).

Some reviews: WD VelociRaptor 600 GB review 1, WD VelociRaptor 600 GB review 2.

I have posted some photos instead of screenshots, because at some points I was unable to take screenshots, for example in the BIOS.

Update Nov 24, 2010.

The issue appeared again on Nov 5, 2010. I have now changed the motherboard to another model, the EVGA X58 CLASSIFIED3. It has now been 11 days using the same HDDs without any problems (fingers crossed). Migrating the RAID mirror was an easy task.