Tuesday, March 20, 2007

Recovery is...

Recovery is the only thing you really really need to be able to do.  As a DBA.  You can mess lots of other stuff up - but get to where you need recovery and then discover that - you cannot - that is a really bad feeling.

Funny how small the world is.  I was just reading this article about a disaster in Alaska with some data.  I was going to write about that - but got interrupted.  When I came back - there was a new article in my blog list so I glanced at it.  It was from Howard Rogers - and it was about... recovery.  And it referenced basically the same article (hosted on a  different server).  And it was from as far away from me as you can get and still be on this planet...

Read about Howard's nightmare here.

We've all had those nightmares happen to us at one point or another.

It is all about recovery.

Too bad Howard wasn't running 10g - flashback table T to before drop;

Too bad I didn't have a longer undo_retention (I do now...).

Too bad the Alaska Department of Revenue didn't test their backups.

Just hoping some people are "scared" now and - well - test their recovery capabilities.  Because it is all about recovery.  We can fix anything else anyone does wrong - but if you cannot recover - that is money, money down the drain...



Anonymous Anonymous said....

Oh how I love that flashback facility and at least two hours of undo retention. Made me a hero several times now. Developers do a bulk delete for customers (yeah, yeah, I know), and then come to me to pull their #($@(*& out of the fire. I always say something like 'Well, I don't know. That would be a major operation to restore to a remote server, do point in time recovery, export and import the data.
Then 10 minutes later all is fixed. Usually worth some small gifts of food and drink.
Trouble is, now they are starting to expect it -- makes them more reckless, it seems.

Tue Mar 20, 05:26:00 PM EDT  

Anonymous Anonymous said....


Howard and the Alaska thing brought back nightmarish memories of way long ago.

I'm sure you know the story...it's late one night after being on the keyboard for near twenty-four hours straight, blood chemistry is basically Mountain Dew, and I mistyped one letter for a three-letter schema name in a drop database command.

And in so doing dropped the production database at 0200!

Fortunately, no one was on the system but me at the time and there were no transactions needing recovery...just applying the backups and the transaction logs.

I did eventually 'fess up to it but the fact that Production didn't notice anything out of the ordinary until I went to confessional was testimony to the backup procedures.

They were certainly better than at one client's...but saving their bacon is another story for another time, I think. :)

Tue Mar 20, 05:43:00 PM EDT  

Anonymous Anonymous said....

A DBA at my office sent out a note about that horror story. As soon as I read it I knew you were going to post a blog entry about it.

Reminds me of those "Want to get away"? airline ads.


Tue Mar 20, 08:37:00 PM EDT  

Blogger Fahd Mirza said....

As you may remember, some months ago you included your whole blog posts in the your feed upon my request as I was unable to access your blog due to the ban on blogspot domain in my area.

The ban is over and I am back to blogging. I just wanted to thank you for what you have done for me by your blog and asktom.

I have paid a humble tribute to you in my blog spot at the following link:



Wed Mar 21, 01:33:00 AM EDT  

Blogger Hariharan Ragunathan said....

Hi Tom,

When you said about recovery, we face d a same problem 6 months ago.By mistake a support guy truncated one of the production table which stores all the login id of around 12 lakh customers.Since he was using TOAD and by mistake he pressed the truncate buttton in the toolbar.Due to fear, he did not inform the same and we came to know only after 20 minutes when complaints started increasing. We use oracle 9iR2 and the cold backup was took 3 days before and we bought that data into the table. It took around 11 hours to retrieve the data from hot back up. But lot of temporary fixes and loss due to the same.

Wed Mar 21, 02:17:00 AM EDT  

Blogger Edgar said....


But Howard is talking about dropping tablespace not a table. So the flashback command wouldn't help him with his issue.

I think that every dba has this kind of horror story.

Wed Mar 21, 07:50:00 AM EDT  

Anonymous Pierre M. said....

One problem I constently faced with this issue is getting the time to reherse a full time, full size scenario. You don't know it works unless you've done it, at least once.
If I'm responsible of the backup, I make sure we have a drill (full recovery from last daily backup on separate system) once every three months or so.
Unfortunately, management always find excuses to NOT practice recovery (takes time, use a similar computer to practice, waste valuable ressources, etc.).
When I teach DBA stuff, I always make the joke that I don't care about backup. I care about recovery.

Wed Mar 21, 09:10:00 AM EDT  

Anonymous RobH said....

Flashback table is a useful feature only if you were smart enough to move all your tablespaces to locally managed. The feature does not work with dictionary managed.

Lesson there

Wed Mar 21, 10:00:00 AM EDT  

Anonymous neil on wheels aka dbms_cycle said....

We end all of our backup scripts with RESTORE DATABASE VALIDATE;

Additionally, we test our backups by using RMAN duplicate database. Hopefully those count :)

Wed Mar 21, 11:35:00 AM EDT  

Blogger David said....

Yes, memories indeed. I can recall "practicing" exporting/dropping/importing tables for re-org (back in 7.3 days) in TST, only to get a phone call to check on something in PRD. Ok, off the phone, where was I...oh yeah, export done, time to drop. GL_BALANCES. (Oracle Financials 10.7SC). No, recovery wasn't tested at the time. Yes, it worked.

As an (Oracle) instructor said to me a long time ago, If you can't recover your database, you'd better be able to recover your resume - you'll need one or the other...

Wed Mar 21, 04:16:00 PM EDT  

Anonymous Harry said....

So, how do you guys test your recovery procedures?

Thu Mar 22, 12:56:00 PM EDT  

Blogger Thomas Kyte said....

So, how do you guys test your recovery procedures?

by doing them, by performing a recovery.

That test environment that needs to be refreshed from production? Perfect case study for a restore.

That development environment? Restore it from production.

Restore bits of production from time to time, just to make sure you can.

Run the rman commands to validate your backups.

Read your tapes.

Verify your archives can be applied.

Thu Mar 22, 01:01:00 PM EDT  

Anonymous dizwell said....

As someone else has said here, I think, "drop tablespace" is significantly different from a mere "drop table" -and the recovery's a darn'd site harder too! Flashback wouldn't have helped me (except flashback database, which wouldn't have been acceptable in any case).

Funny thing: the last self-made disaster I had was in 1985 when I met my very first IBM PC and started working my way through the manual. I got to the page on the format command and tried it out on the C: drive before turning the page to discover what it actually did... and no, you couldn't cancel it. Bang went the only copy of a client's entire audit records for that year!

Still, I suppose 22 years before drinks is not a bad batting average, just to mix my metaphors horribly!

Thu Mar 22, 04:31:00 PM EDT  

Anonymous dizwell said....

between drinks, even...

Thu Mar 22, 04:34:00 PM EDT  

Blogger Q u a d r o said....

Even with a "flashback database" you still need to get dropped datafile(s) from "somewhere" (datafile backup or recreate from archivelogs if they are available from the beginning).

The above means that you can restore them. No - flashback database wouldn't help here either.

Thu Mar 22, 10:00:00 PM EDT  

Anonymous Anonymous said....

At this site you can compare data recovery quotes, I haven't seen any other sites that does that, was very valuable in my case.


Sun May 13, 11:48:00 AM EDT  


<< Home