The Great RAID Drama of 2010

Or why you shouldn't trust SATA controllers you buy for 10 kroner!

Low On Space

So a few weeks ago I looked at the old df -h on fawkes (my storage server) and saw there was only around 17GB free. Out of 1.4TB that's not much. Now I could have gone on a cleaning spree and freed up a little space but I was feeling adventurous and wanted to expand my RAID to four drives. Plus, who am I to grovel for space? He who vowed to never delete again? Expand it was then!

A Solution

I went to dustinhome.dk, my trusty computer peeps in DK and had a poke around. There was a 1TB Samsung whatever for around 500kr I believe. Perfect. So I ordered it and an extra SATA data cable and a power splitter. Total including delivery and tax, around 700kr. Circa $140? Not bad. Or at least I think not bad. It's been a while since I've been to ye olde computer faire.

I had previously bought a (what I thought was a 2-port) PCI SATA controller when I first built fawkes. Turns out the second port was eSATA, not so helpful. Anyway, the RAID array with two drives plugged into the mobo and one into the PCI card had been chugging along happily for the better part of a year with no worries. I needed one more port though. Now a few months ago my company, Readsoft, had held a used hardware auction to get rid of a whole bunch of crap. It's where, among other things (my awesome HP 2U rack-mount server, but that's another story), I picked up a two-port PCI SATA controller for 10kr. So I reckoned that I would just use that and problem solved!

Changing Of The Guard

I removed the old one port PCI SATA card and replaced it with the bargain card. Plugged everything in, etc, etc. I was running dangerously low on screws though (just like old times) and two of the drives were only fixed in place with one screw each. Only after I put everything together did I realise that the drive I had bought actually came with a little pack of four screws. Go figure. On a slightly related note, I had to remove the old paper covering when I opened fawkes up, so when I put everything back together I had to cover her insides up again. This time I used some baking paper. Easier to cover the whole side with less effort. When I finished covering it I was feeling chipper so I cut out a little gaffa-tape stencil with "fawkes" written on it to stick on the side. Looks awesome!

New Drive Added

Booted up, found myself a copy of the old Linux RAID howto and got to work. After a little bit of googling and studying the howto I deduced the procedure. It is pretty simple. All I had to do was to add the drive to the array and then command the array to grow to four devices. Or something like that. The reshape began automatically. I checked the old cat /proc/mdstat and to my chagrin the reshape was proceeding extremely slowly! It was crawling along at around 300k/s with a projected ETA of over 50 days! Not something I was keen on waiting around for!
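
For the record, and working from memory so the device names below are purely illustrative, the whole procedure boiled down to roughly this:

# add the new drive to the array, then grow the array to four devices
mdadm --add /dev/md0 /dev/sdd1
mdadm --grow /dev/md0 --raid-devices=4
# then sit back and watch the reshape (not) fly along
cat /proc/mdstat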

So I let it go and spent the next few days (on and off of course) searching for cases like this one. There were plenty of cases of people complaining of slow reshape speeds but they were quoting upwards of 5000k/s, nothing like what I was getting! I was a little exasperated at this point and almost resigned to waiting over a month for this process to finish. I accidentally discovered a remedy whilst I was syncing the archive to fawkes. It seemed that the disk activity caused by the syncing was forcing the reshape speed up! It got up to around 5000k/s during the sync and dropped back to 300k/s when the disk activity ceased. Well, I thought, it's not pretty, but at this point I was willing to settle for a dirty hack if it got the job done.

A Filthy Hack

I cruised the net for a Linux drive benchmark/testing utility which would stimulate the array for me and speed up the reshape. I settled (well, to be honest I didn't search long) on fio. With minimal fuss I installed and configured it. There was a helpful article on how to benchmark your drives with it. I set it up to do a bunch of random reads and writes to a temp dir on the mounted RAID array. With that in place the disk would go crazy for a few minutes at a time, pausing for about a minute in between. When there was high disk activity the reshape speed would greatly pick up, peaking at around 5000-6000k/s then dropping back down to 100-300k/s when the activity dropped. Not pretty but it seemed to be working. Reshape ETA, a few days.
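
The job file I ended up with was something along these lines (paths and sizes are from memory, so treat this as a rough sketch rather than gospel):

; crude "keep the disks busy" job, run with: fio stimulate.fio
[global]
directory=/data/tmp
size=1g
runtime=300
ioengine=libaio
direct=1

[random-read-write]
rw=randrw
bs=4k
iodepth=8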

Oh The Horror!

To my horror I checked the reshape status when I got home from work one day and discovered that one of the drives had failed! After freaking out for a bit that I might have just lost all of my data (the important stuff being backed up of course) I thought a bit about what I could do. After a little googling I decided that my best course of action was to buy a replacement drive for the one which failed, add it as a spare to the array and hope for the best.

A Turn For The Better

So I got on my bike and popped down to Frederiksberg and a computer store on Falkoner Alle. Picked up a 1TB Western Digital with 64MB cache for 600-odd kroner. When I got back home I removed the failed drive and popped in the new one. Booted up and force assembled the array. The new drive showed up as a spare. I checked the reshape speed and lo and behold it was going at around 25000k/s! Could this be it, my deliverance? For now it seemed so. The reshape finished quickly, taking around 4 more hours I think. I was somewhat excited when it did since I assumed that that would be all there was to it. Not quite, it would seem. Once it finished the reshape it automatically began resyncing with the spare drive. That was expected, since the array was degraded and needed to be whole post-haste.
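
Again from memory, and with illustrative device names, the steps after booting up were roughly:

# start the degraded array without the failed disk
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1
# add the new Western Digital, which shows up as a spare
mdadm --add /dev/md0 /dev/sdd1
cat /proc/mdstat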

The Drama Continues

I checked /proc/mdstat to see how the resync was going and to my disgust it was going even slower than the reshape had been! Around 70k/s. Now I was not really a happy camper. I had just gone to all this effort to get the reshape finished, and now the last thing between me and a healthy RAID array, this minor nuisance, had metamorphosed into a daemon. A spectre from the very pits of hell! I seemed to be back at square one. WTF? thought I. WTF indeed. Could it be the cable, the controller? Could I have removed the wrong drive? I wasn't sure. I had earlier tested swapping out the cable and the same problem had occurred, so I was sure it wasn't that. That left the unpleasant dilemma: either the reshape was inherently slow (it was a freeganed computer) and I would have to tough it out, or the controller was fucked and I would have to swap it out and hope for the best.

I decided to use fio again to force the resync to speed up. Maybe not the wisest of ideas but at this point I was pretty much sick of this whole shamozzle and just wanted my damn RAID array back! So I fired up fio again and the resync sped up to around 3000k/s. Fine, a few more days and all my troubles will be behind me. Yeah right.

The Second Coming of The Beast

The resync proceeded relatively well for the next day, but on Sunday evening a fail event was detected on the array and I got a lovely email telling me of this fact. This was, in fact, very bad :( Now there were not even enough drives to start the array degraded. If the drive had actually failed then this meant that I had just lost all my data. I was not a happy chappy at this point. The fact that another drive had failed so soon after the previous one, and on the same controller, though, left me with a small ray of hope peeking through the stormy clouds of tragedy that were covering the township of hope. I suspected that there was a good chance that the 10kr SATA controller was to blame; that the drive had not actually failed, but that the controller was playing silly-buggers with everything. It was the only thing I had changed recently in the machine before all this happened, so the indications were there. If this was true then all I would need to do would be to swap out the controller and everything (in theory) would be OK.
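
Those fail-event emails come courtesy of the mdadm monitor daemon, by the way. The setup behind them looks roughly like this (address made up, and the config path varies between distros):

# /etc/mdadm/mdadm.conf
MAILADDR admin@example.com

# the monitor itself, normally started by the distro's init scripts
mdadm --monitor --scan --daemonise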

So I decided that I would buy a new SATA controller. This time though I would get a 4-port one so I could eliminate the possibility of something being wrong with the mobo controller too (or the interaction between the mobo controller and the PCI one). So I logged onto the old dustinhome.dk again to give them yet more of my hard-earned dosh. I found a reasonably priced (~450kr) one and went for it.

A Happy Ending

I ordered the card on Monday at work and it was delivered on Tuesday morning. How prompt. If only the previous order was as prompt (although I can't really lay the blame as it was sitting at the post office for most of the time I was waiting for it). After Danish class on Tuesday night I installed the new card, taking out the old one and looking at it scornfully. I plugged all the drives into the new card, skipping the mobo SATA ports. Booted up, all drives detected, good. The array still wasn't built in mdstat but of course I knew it wouldn't be. I spent a nervous half hour searching the doco for how to "un-fail" a drive, to set it as clean. Couldn't find anything. Eventually I just forced the assembly of the array. That did the trick. The "failed" drive was marked as clean and the resync began. It flew along at around 30000K/s. Couldn't get it faster than that but I was OK with it. The resync finished sometime during the night and in the morning I expanded the ext3 filesystem to fill the entire RAID array. That took a couple of hours but when it was done I did a df -h and got something like this:

/dev/md0 2.1T 1.3T 682G 66% /data 

Nice.
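
For completeness, the rescue sequence that night boiled down to something like this (device names from memory, so illustrative only):

# force the array together; the "failed" drive comes back marked clean
mdadm --assemble --force /dev/md0 /dev/sd[abcd]1
# once the resync finished, grow the ext3 filesystem to fill the bigger array
resize2fs /dev/md0
df -h /data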

Now of course I wonder if the original drive which failed actually did? It's definitely worth testing. Now I have a 750GB drive lying around doing nothing. sigh At least I got through this whole ordeal with all of my data intact (and hopefully not subtly corrupted!). It was a little nerve-wracking a few times there, and very frustrating. So the moral of this story is that you shouldn't use SATA cards you buy for 10kr in mission critical systems. Or something like that.