Thursday, January 24, 2008

Why recovery is the only thing...

Why recovery is the only thing a DBA is not allowed to get wrong. I've made the claim that a DBA can mess everything else up, but mess up recovery and - well, you cannot call that person a DBA.

Backups are important. I recently pointed to a really nice warning screen - every application should have that.

This recent news article points out why multiple redundant (and distributed) backups are relevant. My favorite quote:

"The lesson to be learned here is that you can't depend on having just one set of records or files and having your employees have access to them. You've got to have some kind of backup," Jefferson said.
That is a lesson you don't want to learn one minute after the event... Fortunately for that company, they got their stuff back - not via backups (there were none) but by disk forensics (unerase).

I wonder what would have happened when the disk failed (not if, when). In hindsight, their disgruntled employee did them a favor in a way - now they likely have a rock-solid recovery plan in place.

I have two external disks, 250GB and 320GB. I constantly image my laptop and desktop machines to both. I synchronize my desktop and laptop as well. I'm always afraid of the drives failing; it's only a matter of time. I travel with the 320GB backup device.

Just this week, my daughter spilled a mug of hot chocolate in her laptop. Guess what happened :)


Have you tried a recovery today?

9 Comments:

Blogger Daniel Fink said....

Some years ago, I was reading a data warehousing book by Gary Dodge and Tim Gorman and came across a simple sentence that changed how I looked at backups. I don't recall the exact wording, but the basic idea is 'The responsibility of the DBA is not to back up the database, but to restore it.' It was an epiphany.
The shame is that there are so many opportunities for an organization and DBA to practice recovery. Refreshes and new server installs are two that are rather common. And you can do these recoveries without the pressure of a down production system and with a decent amount of sleep.

And for my personal machine...I synchronize my laptop with an external drive. And that external drive is backed up to another external drive. Since I seem to reinstall Windows every 6 months and I often pull old files from my external drives, I know they are good...at least for now. Tomorrow is always another story.

Thu Jan 24, 09:47:00 AM EST  

Anonymous Glenn said....

At work, I always back up the database (using RMAN of course ;) to disk and then the system administrators back the disk up to tape. I have just had really bad luck over the years recovering files from those backup tapes (and this is across a couple of different companies and different hardware equipment).

At home, I back my stuff up to an external hard drive and just recently started using Apple's Time Machine. Time Machine adds a coolness to backing up and makes it easy enough for anyone with a Mac to do - especially with the new Airport wireless router/drive. I will soon start using an online backup company (Mozy?) to back up my photos. The photos are the only thing I truly do not want to lose.

Thu Jan 24, 11:04:00 AM EST  

Anonymous Anonymous said....

My comfort level is only achieved when I know I have (at least) three copies of backup sets on (at least) two different media in two different locations (in addition to RMAN we're still taking a once-per-week user-managed backup - to guard against RMAN bugs).
Twice per month we practice RMAN recovery (on a dedicated machine, with big cheap SATA disks) of all our databases. Guess what happens with the RMAN backup sets used during test recovery? They're backed up too and kept as a long-term archive.
Yes, I confess, I spent way too much time with IBM folks in the past ;-)
Regards,
Ales
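
The rule of thumb in this comment (at least three copies, on at least two kinds of media, in two locations) can be written down as a small check. This is only a sketch, not anything taken from the commenter's setup: the `BackupCopy` record, its field names, and the sample inventory are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of a backup set (hypothetical record, for illustration only)."""
    media: str      # e.g. "disk", "tape"
    location: str   # e.g. "datacenter", "offsite"


def meets_policy(copies, min_copies=3, min_media=2, min_locations=2):
    """Return True if the copies satisfy the copy/media/location minimums."""
    media = {c.media for c in copies}
    locations = {c.location for c in copies}
    return (len(copies) >= min_copies
            and len(media) >= min_media
            and len(locations) >= min_locations)


# Made-up inventory: three copies, two media types, two locations.
inventory = [
    BackupCopy("disk", "datacenter"),
    BackupCopy("tape", "datacenter"),
    BackupCopy("tape", "offsite"),
]
print(meets_policy(inventory))      # full inventory
print(meets_policy(inventory[:2]))  # only the two copies kept on site
```

A check like this only says the copies exist; it says nothing about whether they restore, which is what the twice-monthly practice recovery is for.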

Thu Jan 24, 01:17:00 PM EST  

Blogger Karen said....

Hey, nice beard, bro! YA HA HA!

Thu Jan 24, 02:41:00 PM EST  

Blogger DB Stuff said....

How ironic... here I am at 1:15am waiting for a database to restore so I can re-setup replication (sadly, we still have some other-than-Oracle databases I have to take care of).

Killer is, the issue was actually a SAN firmware problem that took out our whole SAN - thought that wasn't possible? Thankfully, the backups weren't on the SAN, and I was able to restore a few databases to the point in time that the SAN failed.

I do wish, however, that I had tested applying 50+ transaction logs so that I would have realized that a script would make things much faster (remember, I had to use that non-Oracle database). Lesson learned - you don't just need backups, you need to walk through your restores every once in a while to make sure you have the process down. :)
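
The kind of script wished for here might start with something like the sketch below, which only picks out the log backups to apply and puts them in order. Everything specific in it is an assumption: the file naming scheme (`log_YYYYMMDD_HHMMSS.trn`), the function name, and the sample file names are hypothetical, and the actual restore command is left out because it depends on the database in use.

```python
from datetime import datetime
import re

# Hypothetical naming scheme: log_YYYYMMDD_HHMMSS.trn
LOG_NAME = re.compile(r"^log_(\d{8}_\d{6})\.trn$")


def logs_to_apply(filenames, after, until):
    """Return log backups taken after `after` and at or before `until`, oldest first."""
    selected = []
    for name in filenames:
        match = LOG_NAME.match(name)
        if not match:
            continue  # skip anything that is not a log backup
        taken = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
        if after < taken <= until:
            selected.append((taken, name))
    return [name for _, name in sorted(selected)]


# Made-up file names, deliberately out of order.
files = [
    "log_20080124_230000.trn",
    "full_20080124_220000.bak",
    "log_20080124_221500.trn",
    "log_20080125_001500.trn",
]
for name in logs_to_apply(files,
                          after=datetime(2008, 1, 24, 22, 0),
                          until=datetime(2008, 1, 25, 0, 0)):
    print(name)  # feed each name to the database's own restore command
```

Running the ordering step during a practice restore is the cheap way to find out whether the naming scheme and the time window actually match what is on disk.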

Fri Jan 25, 04:21:00 AM EST  

Anonymous darl kuhn said....

It's interesting that Oracle named their backup and recovery tool "Recovery Manager", not "Backup Manager". Yes, it is critical to be able to perform backups, but DBAs earn their money when they are able to successfully restore and recover a database after a failure...

Fri Jan 25, 10:12:00 AM EST  

Anonymous Anonymous said....

I'm wondering what the deal is with zip files. Specifically, the 10gR2 HP-UX Itanium zip download. I unzipped it (WinZip) on a PC, then uploaded it to HP-UX with a normally rock-solid GUI FTP program in binary mode. The FTP blew up consistently in the same place. On closer investigation I noticed why: there was a zip in the zip (Aurora java something), and rather than uploading it as a file, it made the zip file a directory and barfed. At the beginning of the files it also had a zero-length css file that seemed to hang it for a long time. Wassup?

ftp'ing the original zip and using the Oracle provided unzip worked just fine.

Which just goes to show, you can have all the tested recovery procedures with rock-solid software, and a slight change of procedure can still mess you up.

Sat Jan 26, 07:52:00 PM EST  

Blogger Doug Cowles said....

I see an increasing problem from my vantage point: at some places it is not clear who or what entity exactly is responsible for the integrity of the backups; it gets lost in red tape. That is a big problem, I think.

Sun Jan 27, 02:31:00 PM EST  

Anonymous Alex said....

I know a fine tool for restoring data, zip recovery. It scans the archive, detects the data structure, and tries to recover as much information from the corrupted file as possible. Using several different recovery algorithms, the tool keeps the loss of useful data stored in the corrupted archive to a minimum, and it can restore data from corrupted media (floppy disks, compact disks, Zip drives and others).

Mon Feb 18, 06:43:00 AM EST  
