In a previous post I talked about the theoretical possibility of using cheap USB drives as mirrored RAID backup. Testing with flash drives looked good, so when my USB backup drive died I figured it was time to give it a try.
I bought a couple of cheap off-the-shelf USB drive shells with Samsung 750GB drives in them and started to set things up. It is pretty easy to do, as these appear as standard drives once connected and powered up. Steps are:
1. Use fdisk to create a single partition on each drive, with the partition type set to fd (Linux RAID autodetect).
2. Use mdadm to create a RAID1 array from the two partitions.
3. Use mke2fs to create an ext3 filesystem on the array.
4. Mount and use!
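As a rough sketch, the steps above look something like this. The device names (/dev/sdb, /dev/sdc) and the array name /dev/md0 are assumptions that will differ on your system, and the partitioning is shown with sfdisk's script syntax rather than interactive fdisk; these commands are destructive, so verify the device names first.

```shell
# Assumed device names -- verify with fdisk -l before running anything!
# 1. One partition per drive, type fd (Linux RAID autodetect).
echo 'type=fd' | sfdisk /dev/sdb
echo 'type=fd' | sfdisk /dev/sdc

# 2. Create a two-device RAID1 mirror.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

# 3. Put an ext3 filesystem on the array (-j adds the journal).
mke2fs -j /dev/md0

# 4. Mount and use.
mkdir -p /mnt/backup
mount /dev/md0 /mnt/backup
```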
OK, did all that, tuned the filesystem to be a little better for archiving (I set the number of bytes per inode to 1MB instead of the default, which is somewhere around 3K), and started using it. The performance was woeful. My nightly rsnapshot took more than two days to run, hardly a good sign. When the second pass took more than 24 hours just to make the linked copy before it even started syncing, I decided to stop the backup, unmount the filesystem, and let the mirror sync up on its own (it was at something like 10% at that stage). Almost a week later, with no competing disk activity, the sync status on the RAID array showed 45% complete. This wasn't going to work.
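For reference, the tuning and the sync check might look like this. The array name /dev/md0 is an assumption, and note that the bytes-per-inode ratio has to be chosen when the filesystem is created; it cannot be changed afterwards.

```shell
# Create the filesystem with one inode per 1MB instead of the default
# ratio -- far fewer inodes, which suits an archive of mostly large files.
# (-i sets bytes-per-inode and must be given at mke2fs time.)
mke2fs -j -i 1048576 /dev/md0

# Check the mirror's resync progress.
cat /proc/mdstat
mdadm --detail /dev/md0
```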
Checking the process status, I found that each disk had a usb-storage kernel thread attached that was consuming up to 1.5% CPU, so around 3% of the CPU was going just to the syncing.
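A quick way to spot those threads, assuming a GNU ps; the [u] bracket trick just keeps grep itself out of the results, and the fallback message covers machines with no USB storage attached:

```shell
# List processes sorted by CPU usage and pick out the usb-storage
# kernel threads; print a note instead of failing if there are none.
ps -eo pid,pcpu,comm --sort=-pcpu | grep '[u]sb-storage' \
  || echo "no usb-storage threads found"
```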
I've since reconfigured things to use just one of the drives as a plain standalone disk, to see whether the problem is specific to this particular drive combination, and I'll do a bit more research to see if I can find any other reason why it would be so slow. But for me, this blows the idea of a cheap USB RAID array out of the water.