Saturday, June 5, 2010

The Elusive Backup

In my line of work, all photos are irreplaceable: they represent the progression of my work throughout the years and are simply priceless.

I spent years looking for the ultimate backup for my work. I tried complete standalone servers stuffed into small closets, huge drive arrays (12 hard drives) built right into the workstation that kept me warm over the cold winter season, and smaller external RAID systems that tested my patience with their transfer speeds. I never found the perfect backup until recently.


Data recovery is big business, and it survives on the laziness of users who don't bother to back up. But backing up your data is such a chore, and even automatic backup software doesn't guarantee an up-to-the-minute backup. What if your last backup was last night, you have spent the whole day editing, and suddenly your system fails?

Also, backup means that you need at least twice the storage space, and four times the storage space if you wish to be methodical. Your working storage has to be redundant so that you are protected if anything fails on your working system. The backup system has to be redundant too, since you may keep files on your backup system that are no longer on your working space.


Currently I am using a system with built-in hardware RAID, which assures me that when a hard drive fails, I can simply replace the failed drive and the system will rebuild the data. However, this is all hardware dependent: the RAID is a function of the computer, so if the computer fails, the only way to recover and read the data is to have an identical spare computer that you can move the hard drives to.


A couple of years ago, I bought a Drobo, an expandable hard drive unit: as you need more room to store your data, you simply swap one of its four drives for a larger one. However, after my purchase I was able to see the real problems behind the iron curtain of the forums, which were locked to owners only. One out of ten users was losing their data completely. Drobo hid this data loss atrocity by locking the forums, deleting troubled posts and even kicking out owners who caused trouble. In order to get software updates, which you pay for, you had to be a registered owner on the forums.


Some Drobo owners would wake up one morning to find their unit blinking all four LED lights, indicating four simultaneous drive failures. It wasn't completely Drobo's fault: at the time, the 1.5TB Seagate drives had a timing delay error that caused any RAID unit to see the slight delay as a drive failure. With Drobo, this caused a cascade failure, as the unit attempted to recover by moving the data and caused another drive to time out.

I sold my Drobo shortly after discovering this. The second problem is that the storage method is proprietary: once you copy your data onto a Drobo storage device, you are locked into using only Drobo units to read it. You cannot use a non-Drobo drive reader, since the data is stored in a Drobo proprietary format.


Since then, I've always been on the lookout for a NAS (network-attached storage). This is a unit that you attach to the home network; when the backup whim strikes, you simply copy your files to the magical network storage device. Preferably the unit sits in a closet somewhere, so that you don't have to hear the fans and your desk stays uncluttered.


However, most NAS units are costly, usually priced in the several hundreds to several thousands, and they are usually sold without hard drives. Although they may sound like the optimal backup, these units are also proprietary in the way they store your data. Thus if the unit fails, you will need an identical unit in order to read your data off the hard drives.

I cannot begin to tell you the trepidation you will feel as you transfer your hard drives to a new unit, knowing that if this doesn't work, you will have lost all your precious data. With the various methods that I've used over the years, the process isn't automatic after you insert the drives. In fact, some methods are just frightening: the unit warns you about data loss, or that it will wipe your drives, even though that is the only way to rejoin the drives so you can read the data.

Every time a solution presents itself, another brick wall is encountered. Take the 4-drive Mediasonic device that allowed me to build a 4TB (terabyte; 4TB is 4000 GB) storage device. Due to my dinosaur computer hardware, I was only able to read and write up to a maximum of 2TB (eSATA BIOS limitations), and it proceeded to corrupt the data once the 2TB limit was reached, due to the BIOS software wrapping around.
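
For readers wondering where a 2TB wall comes from, here is a minimal sketch of the arithmetic. It assumes the limit was caused by 32-bit sector addressing with 512-byte sectors, which is the usual explanation for a 2TB ceiling but is not something confirmed for this particular hardware. The constant and function names are made up for illustration.

```python
# Assumptions (not confirmed for this hardware): 512-byte sectors and a
# 32-bit field for the sector address.
SECTOR_BYTES = 512
LBA_BITS = 32

MAX_SECTORS = 2 ** LBA_BITS                # number of addressable sectors
LIMIT_BYTES = MAX_SECTORS * SECTOR_BYTES   # 2,199,023,255,552 bytes (2 TiB)


def wrapped_sector(sector: int) -> int:
    """Return the sector a 32-bit address field would actually reach.

    Any sector number at or past MAX_SECTORS loses its high bits, so the
    request silently lands back near the start of the disk.
    """
    return sector % MAX_SECTORS
```

Under these assumptions, a write aimed at the first sector past the limit, `wrapped_sector(MAX_SECTORS)`, lands on sector 0, which would overwrite existing data at the beginning of the volume. That matches the kind of corruption described above.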

A RAID card that lets you build your own NAS inside your own computer caused read/write errors due to a bad driver, and this wasn't discovered until all the data had been moved (copied and erased); one in every five files was corrupted. Thus hardware that depends on the proper software drivers is not a good option, since after a failure you have to re-install the operating system and locate the proper drivers for the new system.

The only reliable external storage system is one that you attach directly via eSATA or USB and that uses the file structure of your operating system. Then, if all else fails, you can remove the drives from the external casing and plug them directly into a computer to read off your data.
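
The lesson of that move is to verify each copy before erasing the original. A minimal sketch of that idea in Python follows; it is an illustration, not the tool used at the time, and `sha256_of` and `verified_move` are hypothetical helper names.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(source: Path, destination: Path) -> None:
    """Copy source to destination; erase source only if the copy matches."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    if sha256_of(source) != sha256_of(destination):
        # Keep the original and discard the bad copy.
        destination.unlink()
        raise OSError(f"copy of {source} does not match; original kept")
    source.unlink()
```

One caveat: a read-back done immediately after the copy may be served from the operating system's cache rather than from the disk, so a more cautious routine would re-check the copies later, before trusting them as the only version.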


The trouble with the simplest RAIDs (RAID is a method of keeping your data redundant) is that in order to protect your data, the RAID device has to keep two copies of it; if you lose one copy due to a hard drive failure, you can recover from the second copy. This means that only half of the total space is usable; the other half is a copy of the original.


The beauty of RAID 5 is that it needs less space for protection: with three drives, only one third (instead of half) of the total space goes to protecting your data. I've been trusting the RAID 5 method since the late 90's; however, all that has changed recently as I realize how vulnerable my data is.
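
The space trade-off can be put into numbers. This is a minimal sketch, assuming equal-sized drives, a two-drive mirror and a standard single-parity RAID 5 layout; the function names are made up for illustration.

```python
def mirror_usable_tb(drive_tb: float) -> float:
    """Two-drive mirror: both drives hold the same data, so one drive's worth is usable."""
    return drive_tb


def raid5_usable_tb(drive_tb: float, drives: int) -> float:
    """RAID 5: one drive's worth of space holds parity, the rest is usable."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return drive_tb * (drives - 1)


def raid5_overhead_fraction(drives: int) -> float:
    """Fraction of the total space spent on protection in RAID 5."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return 1 / drives
```

With three 1TB drives, `raid5_usable_tb(1.0, 3)` gives 2TB usable and one third of the space goes to protection; with four drives the overhead drops to one quarter. A mirror of two 1TB drives always leaves 1TB usable.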


RAID 5, whether it is software or hardware driven, depends totally on your current version of that software or hardware. If the computer fails or the NAS device fails, the data is held ransom until you are able to rebuild, repair or buy another unit that reads the data.


Your data is vulnerable with all the RAID methods since they are all software or hardware dependent. Software RAID does give you portability from one system to another, but you will still need to re-install the software before you can access your data, and if you change software versions you run the risk of data corruption. Also, what if the NAS or RAID card manufacturer no longer exists? Obtaining drivers or a replacement unit will be close to impossible.

The only reliable RAID method is mirroring, where an identical copy of your data is written to a second hard drive simultaneously, and then only if it mirrors an OS file structure that your computer can read without the hardware enclosure.


Now remember, when I talk about mirroring, I don't mean the hardware dependent mirroring of most NAS units, which use the Linux ext2 or ext3 file structure to store your data. If you are using a Mac and are able to mount the hard drive volume, then you are in luck; however, do check that you can mount the hard drive volume separately from the NAS unit.


With the Mediasonic HUR1-SU2S2, a dual-bay RAID unit, any data that is written to one hard drive is duplicated on the second drive. When I pull a hard drive out and insert it into a docking port, I can access the data on that drive; the identical data is stored on the other drive.

The beauty of this setup is that it is hardware driven but is not proprietary.

Recently, I had a chance to test my new backup system when one of my dual-drive enclosures failed. Stripping the enclosure away from the hard drives, I inserted the drives into the dock to check the data; both drives showed the data was intact. Since the enclosure was only two days old, I was able to exchange it for another enclosure immediately.

Since I had two copies of the hard drive, I decided to keep one on the shelf as a backup, use the other as the main hard drive, and insert a brand new drive for mirroring. The unit proceeded to mirror the drive without a hiccup.
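
A check of two mirrored drives like this could also be scripted rather than done by eye. Here is a minimal sketch in Python, assuming both drives are mounted as ordinary folders; `relative_files` and `compare_mirrors` are hypothetical names, and this is not how the check above was actually performed.

```python
import filecmp
from pathlib import Path


def relative_files(root: Path) -> set[str]:
    """List every file under root, as paths relative to root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_mirrors(drive_a: Path, drive_b: Path) -> dict[str, list[str]]:
    """Report files missing from either drive or differing in content."""
    files_a = relative_files(drive_a)
    files_b = relative_files(drive_b)
    differing = [
        name
        for name in sorted(files_a & files_b)
        # shallow=False compares the actual bytes, not just size and date.
        if not filecmp.cmp(drive_a / name, drive_b / name, shallow=False)
    ]
    return {
        "only_on_a": sorted(files_a - files_b),
        "only_on_b": sorted(files_b - files_a),
        "differing": differing,
    }
```

If all three lists come back empty, the two drives hold the same files with the same contents. On terabytes of photos a byte-for-byte comparison takes a long time, so it suits an overnight run.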

Twelve hours later, the mirror process completed; during the whole process, I had read/write access to the original data.


Finally, after all these years, I have found the perfect solution.


It is costly in the sense that I am doubling my storage cost, but at least
- I am not quadrupling my storage requirements by keeping redundancy on my main storage as well as my backup.
- I am completely free of any operating system dependencies, so no software RAID and no software drivers.
- I am free to install any operating system without the fear of a new driver corrupting my existing data, and I am free from using any hardware dependent units.
- I can plug the drives directly into the computer or use an external USB drive enclosure, which makes the data portable.


Every photo is stored twice automatically; thus

- if one of my hard drives fails in the middle of a wedding edit, the Mediasonic will automatically switch to the second hard drive.

- if my system fails, I can move my Mediasonic device to another computer and continue editing, right up to my last save.

- if my Mediasonic device fails, I can disassemble the unit, plug one of my drives into a docking port and continue editing; again, right up to my last save.

As with all cliches, this is such a win/win solution that I am totally stoked and simply had to share.

I can honestly say that this magic trick uses mirrors.
