Over the last few months I've run into a few RAID (mostly RAID-5) horror stories.
MegaRAID breaks killing RAID.
If you have ever had an Xserve G5 with hardware RAID, you are probably familiar with this story. Basically, what happens is that your RAID 5 will die for no apparent reason. Sometimes restarting, then removing one of the drives and replacing it, will fix the issue. Sometimes you need to boot from CD and repair the RAID using the terminal commands. Sometimes you need to re-add it in Open Firmware.
The problems with this setup are common enough that I routinely remove the cards and set the server up as a software mirror.
Xserve RAID disaster 1
The Apple Xserve RAID was a nice product, since discontinued. Basically it consisted of a 3U box split into two hardware RAID arrays. The two arrays are independent, but can have software RAID applied to them to create, for example, a mirror. Apple details a number of setups, including such things as striping the two halves and concatenating the two halves (WTF???). You can probably guess what happens next.
The Xserve RAID was set up with two arrays (RAID level 5). They were then software striped. The RAID was left unmonitored. One side failed (two bad drives on one array). All the data was gone. The only backup was several months old.
The Xserve RAID was set up from scratch, using larger drives and only using one half of the array. A backup system was implemented to create daily backups.
Xserve RAID disaster 2
Similar situation, but in this scenario they simply concatenated the two sides. Then the controller in the second half started having 'issues'. This turned out a little better, because most of the data was physically on the first half of the RAID. The downside was that to access the data, the whole concatenated volume needed to be online, but as soon as you accessed anything on the second half of the RAID, the concatenated volume went offline and the server needed to be restarted.
Again, there was no current backup. All the data that could be retrieved was taken off the RAID. The Xserve RAID was replaced, and a backup strategy implemented.
PC hardware RAID mirror
This is an older story. This was a cheap Linux box with a "RAID" card which basically supported striping and mirroring across 2 volumes. What happened was that the RAID controller itself failed, and wrote garbage to both halves. Some data was retrievable, but since this was also the boot drive, the system was down for over a day.
External RAID 5 box
This is another system set up by someone who fails to grasp the basics of RAID. This is another familiar story.
The RAID box connected to the server via eSATA. It supported a number of useful features: RAID 5, hot spares, mixed drive capacities, a web interface, visual and audible alarms, email monitoring etc.
The array was set up as a straight RAID-5. Identical drives were used, all from what turned out to be a bad batch. Two drives failed in close sequence. This company outsourced its IT, and the staff had no training, so they were unaware of the fault until the RAID failed and went offline. They had a backup system in place, but it too was unmonitored, and they were untrained in using it.
They changed IT support companies and proactively sought training in how to monitor the RAID and their backup system. A hot spare and email monitoring were set up.
External RAID box
This was another external RAID box. The company which set it up for them said that they didn't need a separate backup because two drives would need to fail.
One weekend, there was a small fire in the office which set off the fire sprinklers. The RAID was destroyed, and with no off-site backup they lost their data.
Xserve with hardware RAID
This company shelled out for an Intel Xserve with hardware RAID. The hardware RAID in the Intel Xserve replaces the regular SATA backplane. The problem here, again, was the use of identical drives from a bad batch in combination with a hot weekend.
This company leaves their IT equipment running over the weekend, but turns the aircon off. One weekend the temperature hit 43C. They came in on Monday to a server which would not boot. The RAID controller itself failed, and two of the drives were working intermittently. Thankfully, the company had a backup of their vital data.
A RAID is not a substitute for a backup system; it is just a part of your business continuity strategy. A RAID allows you to continue working while the repair is carried out. A properly implemented RAID reduces the chances of catastrophic failure.
Data recovery from a RAID-5 is far more expensive and difficult than data recovery from a mirror or a single drive. This means a good backup system is MORE important if you are running RAID-5.
Don't put all your faith in a RAID controller. Sometimes controllers go bad. Do some research. Be wary of systems that make it hard to change to a non-hardware RAID setup.
Understand the underlying technologies. If you are using hardware RAID-5 it probably uses a proprietary algorithm to write data across the array. If your concern is redundancy, then don't stripe (RAID-0) in software or hardware. Concatenation is rarely useful.
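To make the redundancy point concrete, here is a minimal sketch of textbook XOR parity, the idea behind RAID-5. It is an illustration only: the block contents and the three-data-drive layout are made up, and a real controller's on-disk format (possibly proprietary, as noted above) will differ.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three hypothetical data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One drive lost (the one holding data[1]): XOR of everything that
# survives, parity included, gives the missing block back.
rebuilt = xor_blocks([data[0], data[2], parity])

# Two drives lost (data[1] and data[2]): the survivors only yield the
# XOR of the two missing blocks, which is not enough to recover either.
leftover = xor_blocks([data[0], parity])
```

Here `rebuilt` equals `data[1]`, while `leftover` is only `data[1]` XOR `data[2]`. Striping two such arrays together adds no parity between them, so one array hitting the two-drive case takes the whole striped volume with it.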
Understand Redundancy. RAID is not a simple technology. There are a number of things that trip up the inexperienced. Use different drives. You are more likely to have two drives fail in quick succession if they are from the same batch. Consider adding a hot spare or using a higher RAID level. The longer it takes to get your RAID back up and running, the greater the risk of that second drive failing. Long rebuild times can push the chance of a second drive failure above acceptable limits.
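A rough way to see how rebuild time feeds into risk is the back-of-envelope sketch below. It assumes drives fail independently at a constant rate, which is exactly the assumption a bad batch or a 43C weekend breaks, so treat the result as a lower bound. The function name and the sample figures (5% annual failure rate, four remaining drives) are hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365


def second_failure_probability(remaining_drives, annual_failure_rate, rebuild_hours):
    """Chance that at least one remaining drive fails during the rebuild.

    Assumes independent drives with a constant (exponential) failure rate.
    """
    # Convert the annual failure probability into an hourly failure rate.
    hourly_rate = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    return 1.0 - math.exp(-remaining_drives * hourly_rate * rebuild_hours)


# Hypothetical 5-drive RAID-5 with one drive already dead.
quick = second_failure_probability(4, 0.05, rebuild_hours=12)    # hot spare kicks in at once
slow = second_failure_probability(4, 0.05, rebuild_hours=24 * 7)  # a week until someone notices
```

Under this model the risk grows almost linearly with the time spent degraded, so an unmonitored array that sits broken for a week is over ten times more exposed than one that rebuilds onto a hot spare overnight.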
Educate Users. IT support won't always be around when something goes 'funny'. If users know what to look out for, and they know that a flashing amber light or a beep is not normal, then they can report it. The sooner the problem is noticed, generally the better the outcome is.
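Trained users are one line of defence; an automated check is another. The sketch below assumes a Linux box using the kernel's software RAID (md), which reports array health in `/proc/mdstat`. None of the hardware RAID boxes in the stories above work this way, and the sample text and function name are made up for illustration. It only handles arrays named `md<number>`.

```python
import re

ARRAY_LINE = re.compile(r"^(md\d+)\s*:")
# Matches e.g. "[3/2] [_UU]": expected members, active members, per-slot state.
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that are missing at least one member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Hypothetical sample in the /proc/mdstat layout: md0 healthy, md1 degraded.
SAMPLE = """Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks [2/2] [UU]

md1 : active raid5 sdd1[2] sdc1[1]
      1953260544 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]

unused devices: <none>
"""

problems = degraded_arrays(SAMPLE)
```

On a real system you would read the text from `/proc/mdstat`, run the check on a schedule, and send an email or alert whenever the list is not empty.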
Have a backup strategy. An offsite backup makes sense.
I have on my hands a large number of unused PCs, including dual 1GHz P3 and 2.4GHz+ Xeon machines. What I am hoping to do is get at least one box set up with a nice flavour of Linux, and set it up as a network services box. On the list of services to try to get running are DHCP, DNS, VPN, possibly NAT/firewall (although I'd probably like that running on a different box to my DNS), LDAP, RADIUS, and Kerberos.
I'd like something rock solid and easily administrable to run the core of the internal network. DNS, DHCP and a Kerberos realm are central to that. LDAP is handy because Macs hook into it well, and RADIUS is handy for the wireless access control.
It is really a long term project, because I will basically be building it from the ground up. I have a couple of machines in mind to run it on, but they all need hardware sorted (install RAM, hard drives, cases, fans etc). And then I will need to install the OS. Ubuntu is the top choice at the moment, with CentOS, Debian or even something like FreeBSD or OpenSolaris as outside chances.
OS choice is dependent on finding the right software tools and what they run on best. I will probably have it up and running sometime in June 2010, lol
The PC was a deal. About $1000 all up. I bought it nearly complete. I had been looking for a decent case and dual 16x PCI-e board for quite a while. This guy was selling the case, PSU, mobo, cpu, and dual 8600GTSs, with a 300GB drive and 1GB RAM thrown in. All in all I was at least a couple of hundred better off buying it all together.
So I bought it, added two DVD burners, a Pioneer 212 and an ASUS with Lightscribe, the 320GB and 80GB from the old PC (the Athlon64 939pin 4000+), 2GB RAM, and two cool blue LED 12CM case fans. I hooked up the ViewSonic 22", Logitech wireless keyboard and mouse, got Vista up and running, and left it at that.
Noticed some weird issues with it. It is very fussy with RAM and SLI. Had to bump it down to 2GB from 4GB because SLI wouldn't work with 4 sticks. Not that I am using RAM on their 'certified' list. And the network dropped off regularly until I reinstalled Vista.
About the only use it gets is playing Call of Duty. But we all know that PCs are just games machines :-)
It's only been what, a year and a half?
What's happened? Got rid of a bunch of stuff, and got some more stuff.
At the moment I have: AmigaOne, 2GB RAM, Radeon 8500, SoundBlaster Live, SiI based ATA and SATA cards, DVD burner etc
Amiga A1200 PPC 240MHz + 68060@50MHz, SCSI, 256MB Fast, Voodoo 16, Realtek 10/100 NIC, Soundblaster 128, OS 3.9, 2GB HDD
A2000 - dead, but has A2320 flicker fixer, A2630 with Rocket Launcher (68030+68882@50MHz) and DKB A2632 with 96MB fast, Oktagon, 8X CD-R, 2MB Chip, 1.3/2/3.1 ROM switcher, Toccata sound card, A2095 Ethernet, EGS Spectrum graphics card. Getting another A2000 to swap the cards into.
PowerMac 6100 with G4, 264MB RAM, 4GB HDD, DVD-ROM, MacOS 9.1
PowerMac G5 1.6GHz, Bluetooth, Airport, 9600Pro, 4GB RAM, 2x 500GB HDD, OSX 10.5 Server, PCI Gigabit ethernet
Mac Mini 2GHz Core 2 Duo, 3GB RAM, 250GB HDD, 24" BenQ LCD, Leopard 10.5.2
Mac Mini 1.42GHz G4, Airport/Bluetooth, CD-RW/DVD combo, 1GB RAM, 250GB HDD, OSX 10.4 Server
PowerBook G4 1.5GHz, 2GB RAM, 160GB HDD, Tiger
Core 2 Duo 2.6GHz, 2GB RAM, 1x 320GB, 1x 300GB, 1x 80GB HDD, dual 8600GTS SLI, ThermalTake Aguila Case, Asus Striker Extreme, Vista Business 64bit, 22" ViewSonic LCD
All networked together on a Gigabit switch.
The idea is to get rid of 1 or 2 of the Macs, and the PC when I get a Mac Pro (2.8GHz 8core with Airport and 8800GT).
The way it is set up at the moment is the G5 is a fileserver, the Mac Mini G4 is a torrent box, and the Mac Mini Core 2 is the main workstation. The others get used far less often, but that's normal.
I built two boxes. One was an Athlon64 3200 in my old AthlonXP 2500 box, with new RAM and optical drive. I used an ASUS A8V-MX all-in-one board (the cheapest 939 I could find). I like all-in-ones: they always work out cheaper than adding features on to other budget boards. I paid 207.50AUD for the CPU and 65 for the MoBo. The CPU was cheap when I bought it, but prices dropped about 20AUD by the time it arrived, and then another 10AUD in the week after that. I used some Apple RAM I had lying around (2x 256MB DDR-400 CL3), and the BenQ DVD-RW drive that was in my old server (and then in a firewire case). The hard drives, FDD, fans and PSU were still in the case from when it was an XP2500. I also used a Venus 11 heatsink from Thermaltake, which I had lying around for /ages/.
I ended up selling the CPU and MoBo for 300AUD after a few weeks. Which was nice.
The AthlonXP box I built was originally going to be one of two. One for my mother, and one for me (to run on the mac via VNC or Remote desktop). However, one of the boards I got was a dud. So I ended up just building a system for my mum. It is an MSI KT4AV motherboard with a Palomino core XP1600, 512MB DDR-400 RAM, Netgear NIC, Sony CD-RW, and (at the moment) a 20GB HDD and TNT2 graphics. I installed the lot in an old (originally a Celeron) ATX case, and added a few fans. The PSU is 300 Watt, which should be adequate. The only downside is that it is too noisy.
The hard drive is an interesting matter. Originally, I installed an 8GB Quantum from my Mum's previous (and dead) system. After a few days' use it showed the same fault as the old system had. I was temporarily worried that the hard drive was bad, and killing motherboards. It turned out that it was just a failing hard drive. So, I had to install a new hard drive. A 40GB went in. It was from the Power Mac DA 533, and was failing SMART. I installed WinXP, and messed around with it. However, since the hard drive was dying, I needed to find a replacement. A lot of tangential thinking led to me taking the 20GB out of the A1. I installed Win98, but now need to copy the old files over from the previous HDDs.
The other experience this MSY Athlon has had is many graphics cards. Initially, I used my Matrox 400 as it was the only AGP card (out of about 10) that was compatible with the slot. But I like the Matrox, so I decided to get another card. I picked up a TNT2 M64 32MB for 15AUD. A nice price. The card works nicely, but initially needed reseating. In the meantime, I have used the machine for testing and flashing cards for Macs, so it has had an FX5200, 3 9700 Pros, a 9800 Pro, an X800XT PE VIVO, a GeForce2 MX200, and an S3 Savage4 (it was meant to be a GeForce2 MX400) installed. All the cards worked fine. It has also had SiS, ATi, and S3 PCI graphics cards installed.
This is all about my ongoing fumblings with hardware. Regular entries should provide an indication of the depths of my obsession.