Tuesday, May 17, 2005

Rule of Thumb

Rule of thumb, also known as ROT. Good or bad?

I say, it depends. It depends on who is applying the rule of thumb. It depends on how knowledgeable and experienced they are. ROT in the hands of someone with experience, with knowledge, with lots of background - very useful. They understand when and where the ROT applies.

ROT in the hands of a novice, or someone not in the "know" - as the robot in my favorite television show growing used to say "Danger, Will Robinson! Danger". You should read that Wikipedia definition there - it holds the answer to ROT in the hands of a novice:

the catch phrase is a facetious way one person tells another that he or she is about to make a stupid mistake; that there's a factor he or she overlooked which ought to be taken into account.


I think that sums it up nicely, there is a factor they have overlooked which ought to be taken into account.

ROT: Use bind variables. Almost always true, unless
  • You are doing a data warehouse.
  • You are running a once a month batch report.
  • You are using static SQL in PL/SQL (it is impossible to not bind correctly when using static SQL in PL/SQL - use literals to your hearts content and don't be afraid!).
  • ...
But in the hands of a novice, the mantra of use bind variables taken as a mantra - becomes a problem. They'll take queries like "select * from table where type = 'XXX' and id = :x" and say "mustremoveliterals.". However, here it is perfectly acceptable to have the literal XXX, not only acceptable - but better.

ROT: Find full table scans, they are evil. Not even almost always true, but fully believed anyway.

I read your advise: FTS not always evil. But if it is FTS on Very large tables, then developers demand it to be avoided: I have the following query which takes around 16 seconds if the 'mrs' table has above 25000 records.MRS has indexes on mr_seqno, mr_number, mr_template.
===============
SELECT DISTINCT
FROM bps_users EXP,
valid_type_values rv_c,
mr_personnel_assigns
expeditor,mr_revisions,mrs
WHERE rv_c.vtyv_vtyp_type(+) = 'MR REVISION CONDITION'
AND rv_c.vtyv_value(+) = mrv_complete_cancelflag
AND mrs.mr_seqno = mr_revisions.mrv_mr_seqno
AND expeditor.mrpa_mr_seqno(+) = mr_seqno
AND mr_revisions.mrv_revno = mrs.mr_lastrevno
AND EXP.busr_id(+) = expeditor.mrpa_busr_id ;
The only thing I could say was "if that query is using a SINGLE index, something is seriously wrong - seriously totally wrong". The query was - it had one full scan and tons of nested loops (they were using the RBO). No matter what they tried, one full scan and since it was taking "long", it must be the full scan. It was more likely the LACK of full scans on the remaining tables.

ROT: Partition a table when it exceeds N rows. Not even close to being partially true, but widely believed. You partition tables based on how you use them. You might partition a 1,000,000 row table, you might not partition a 100,000,000 row table. (how big is a row anyway?).

And there are tons of other ROTs out there. In the hands of someone that has an understanding of how the database/feature works - ROT might well be OK. I admit, I use them myself - but it is personal ROT (really, it is knowledge at that point) and I use it in conjunction with other facts. For every "rule of thumb" out there - there are hundreds of cases where (going for three again)

  • It applies (it is a positive thing)
  • It does not apply (it is a negative when applied in these cases)
  • It neither applies not does not apply (it is totally neutral, neither good nor evil, but at best it is a distraction)
This is one reason why I won't myself develop a set of "rules of thumb" for publication. I will show how the software works, what it does, how it does it. I will share my experiences with it. I will show caveats all of the time (the "yeahbuts" I like). For every "rule", there is a counter case that doesn't disprove the rule necessarily - just says "it doesn't always apply". So, as the Pink Floyd were known to say "Careful with that Axe Eugene" - be careful with ROT. It'll either:
  • Be good
  • Be bad
  • Be neither good nor bad
It all depends. Think of some of the ROT you know about topics like gathering statistics, reorgs, access paths (there are people that will never use NOT IN, there are people conversely that will never use NOT EXISTS - because they had a 'bad experience' with one or the other. Neither is evil once you get to know them).
POST A COMMENT

21 Comments:

Blogger Tom Best said....

If a particular ROT were 100% true, as some would believe it to be, then they have to ask themselves, "Why doesn't Oracle code this to force it to happen all the time then?". The answer is that it would be a disaster some of the time, and there would be nothing you could do about it.

Tue May 17, 11:38:00 AM EDT  

Blogger Thomas Kyte said....

If a particular ROT were 100% true

If a ROT were 100% true, it would be a "law", "a governing force".

I'm reminded of the question "which is better -- a hash join or a nested loops"

The only reasonable answer: it depends (and if there were only one, we wouldn't have bothered implementing the other!)

Tue May 17, 12:33:00 PM EDT  

Anonymous Simon said....

ROT - Test new features always :)

Tue May 17, 01:39:00 PM EDT  

Blogger Jeff Hunter said....

ROTs, in general, are not a bad thing when applied appropriately. To apply a ROT appropriately, one has to understand the basis for the rule and the situations in which it should and should not be applied. This knowlege comes with experience and experimentation.

Tue May 17, 02:17:00 PM EDT  

Blogger Thomas Kyte said....

ROTs, in general, are not a bad thing

hey, that smells like a ROT.

This knowlege comes with experience and experimentation.

right and that is why I hesitate to give ROT out when asked for ROT and when ever I see uncaveated ROT the first thing I do is say "doesn't apply here, doesn't apply there, doesn't apply yonder either"


It is not that I've disproved the ROT, but rather caveated it (or shown why it would be a bad thing to build some sort of model on unless model takes into account all of the caveats...)

Tue May 17, 02:21:00 PM EDT  

Blogger DaPi said....

I had a go at this a while ago, and no one shot me down . . . . In praise of ROT

Tue May 17, 02:38:00 PM EDT  

Blogger Thomas Kyte said....

In praise of ROT

Hah, funny you used IN/EXISTs... (I used the anti-exists, anti-in)

I actually had a ROT for them with the RBO, but it was more a set of rules:

Q is the "outer query"

iS is the "subquery" as a result of an in.

eS is the "subquery" as a result of an exists

select * from emp
where deptno in ( select deptno from dept );

Q is "select * from emp"
iS is "select deptno from dept"

select * from emp
where exists (select null from dept where dept.deptno = emp.deptno )

Q is the same
eS is "(select null from dept where dept.deptno = emp.deptno )"


a) when Q is large use iS
b) when Q is small and iS is small, use iS
c) when Q is small and iS is large, use eS


So my ROT started with in subquery well before exists for it works in more cases ;)....

then the CBO came along, now we don't need it.

But ROT for experienced people (the ROT we build up based on our experiences) works for us -- but give that ROT to a novice without the understanding and you get ROT.

said to Novice: "use exists, they work best in general" -- becomes "thou shall use exist". Especially for the novice, expecially coming from someone with lots of experience.


Rather than

said to Novice: "Let me give you a conceptual understanding of how the RBO will evaluate a where exists, it is look a big for loop. As opposed to an IN, which is alot like a sort-distinct and a join. Now that you know that, you can think about when it would be appropriate to use one over the other"

Tue May 17, 02:54:00 PM EDT  

Anonymous Anonymous said....

Amusing...in that heuristics, or 'Rules of Thumb' are being used in artificial intelligence...maybe even for things like 'self-tuning' database management systems?

Tue May 17, 03:20:00 PM EDT  

Blogger Thomas Kyte said....

Amusing...in that heuristics

in AI, they are used in a self learning way. You have to train the model and it expects -- given different situations -- to change totally it's rule set. Eg: It (the AI engine) learns from it's mistakes.

As does the 'self tuning' stuff. Take a look at sql profiles, "hey, database run this for me and tell me about it, collect bits of information about this query and help me figure out based on non-simple rules whether we should hash or nested loops this thing"


Problem with most ROT given from person to person -- It is very simple stuff:

full scans -- bad

The supposed goal of ROT is to make the complex appear trivial but many times ends up making the trivial complex.

And uncaveated ROT, ROT without "yeahbut", scares me. I want to know the "yeahbut" -- because if there were no yeahbuts, it would just be the rule, not the rule of thumb.

Tue May 17, 03:28:00 PM EDT  

Blogger Tom Best said....

This comment has been removed by a blog administrator.

Tue May 17, 04:27:00 PM EDT  

Blogger Tom Best said....

This may be obvious, but the phrase "rule of thumb" originates from making measurements by literally using your thumb. Would you rather say "The asking price of my house is 350,000 dollars", or "The asking price of my house is a 2-thumb high stack of 1,000-dollar bills."?

Tue May 17, 04:28:00 PM EDT  

Blogger Joel Garry said....

http://www.the-robotman.com/

Tue May 17, 06:00:00 PM EDT  

Blogger Jan said....

Just wondering (while showering), when you mentioned AI and self learning, might that be a direction for the CBO? For example, i was just playing with this query that did a nested loops instead of a hash join. Of course this was because the CBO thought there would be something like 1000 rows left after applying some filters, but in reality that turned out to be more like 200000... So a histogram worked nicely in this case, but in my naivety i thought maybe the optimizer could pick up on that. "Hey, last time i though it would result in 1000 rows, but it turned out to be 200000, let's keep that in mind"...

Tue May 17, 11:26:00 PM EDT  

Blogger Thomas Kyte said....

Jan said... Just wondering

sounds a lot like 10g and SQL Profiles (or dynamic sampling hint in 9ir2 and later...)

Wed May 18, 07:15:00 AM EDT  

Blogger Matthias Rogel said....

great ROT I experienced:

when you have an Oracle problem,
try asktom.oracle.com at first.

You'll find a solution there.

If you don't, you won't find any elsewhere.

Furthermore, trust Tom.

I have followed this ROT several times for the last 2 years
and always succeeded !

Wed May 18, 10:55:00 AM EDT  

Blogger DaPi said....

a) (a bit off topic) Has anyone out there actually used, in real-life, the non-intuitive (but logical) way that NOT IN handles NULL's?

b) Not convinced about "then the CBO came along, now we don't need it." but I won't put my head in that noose 'till I've made some tests.

Wed May 18, 01:32:00 PM EDT  

Blogger Thomas Kyte said....

dapi said... a) (a bit off topic) Has

I have never seen it do another other than cause heartburn (the NOT IN null handling). There is a lot to not like about nothing -- and this is one of them.


But the CBO recognizes NOT IN and NOT EXISTS as being equivalent IF the NOT IN subquery is known to return all "not null values" and should (ok, stress should) rewrite the query both ways to see which would be more efficient given what it knows about the data.

But yes, if it makes a mistaken assumption (gets the cardinalities wrong)....

Wed May 18, 01:38:00 PM EDT  

Blogger DaPi said....

"There is a lot to not like about nothing" on the other hand see "The Book of Nothing" by John D. Barrow (professor at DAMTP, Cambridge, UK) - 300+ pages about Nothing.

Wed May 18, 05:43:00 PM EDT  

Blogger jimk said....

New book out called On Bullshit. I can see it making a greast stocking stuffer.

Thu May 19, 10:03:00 AM EDT  

Blogger Thomas Kyte said....

new book out

well, it's almost 20 years old if you mean the Frankfurt book (appears to have been a paper written in 1986 and later published as a very short book)

But, ka-ching, amazon appreciates your suggestion (as I just ordered it anyway :)

Thu May 19, 10:27:00 AM EDT  

Anonymous Chic Barna said....

"One test is worth a thousand expert opinions." - currently attributed to anonymous

Mon Aug 29, 06:31:00 PM EDT  

POST A COMMENT

<< Home