Tuesday, June 27, 2006

I've enjoyed the "what the???" sites...

I’ve enjoyed the “what the??” sites – both the “Oracle” and “IT in general” one. Today – I’m going to post two of my own “what were they thinking”.

I received this function in the mail this morning. It is a neat one. Can you quickly tell what the purpose is? (I’ve renamed all variables, they were less meaningful before the change – did not want anyone grepping their code to see if it was one of their functions)

CREATE OR REPLACE
FUNCTION convert_date( the_date_string IN CHAR)
RETURN DATE
AS
date_to_return DATE;
yyyy_num NUMBER(4);
mm_num NUMBER(2);
mm_char CHAR(2);
dd_num NUMBER(2);
BEGIN
IF the_date_string IS NULL
THEN
date_to_return := NULL;
RETURN date_to_return;
ELSE
yyyy_num :=
TO_NUMBER(SUBSTR( the_date_string,1,4));
mm_num :=
TO_NUMBER(SUBSTR( the_date_string,5,2));
dd_num :=
TO_NUMBER(SUBSTR( the_date_string,7,2));
IF mm_num > 9
THEN
mm_char := mm_num;
ELSE
mm_char := '0' || mm_num;
END IF;
date_to_return :=
TO_DATE(yyyy_num || mm_char||
dd_num,'YYYYMMDD') ;
RETURN date_to_return;
END IF;
END;
/


Yes indeed – the functional equivalent of:

To_date( the_date_string, ‘YYYYMMDD’ );


Interesting. The other one was a multi-page (many pages) query. The correlation names used in this query?
  • Aaaaaa
  • Bbbbbb
  • Cccccc
  • Aaaa
  • Bbbb
  • Cccc
  • Dddd
  • Aa (many times!! For many different tables in inline views)
  • Bb (same as aa!)
  • Cc

I told the person that sent it to me that I felt I was working on the human genome project with DNA sequences, not a SQL query. Best yet was this part in the query that looked like this:


from
( select aa.c1, aa.c2,
aa.min_this, aa.max_that,
bb.cnt_distinct
from (select c1, c2,
min(this) min_this,
max(that) max_that
from t
where c3 between 'X' and 'Y'
and c4 between 'A' and 'B'
group by c1, c2 ) aa,
(select c1, c2,
count(distinct something) cnt_distinct
from t
where c3 between 'X' and 'Y'
and c4 between 'A' and 'B'
group by c1, c2 ) bb
where aa.c1 = bb.c1
and aa.c2 = bb.c2
) aaaa


Now, does anyone else see an obvious performance enhancing technique they might apply to this? Maybe we just move the count(distinct something) up into the AA inline view, lose BB altogether and remove the join…

Not only that, but that inline view – referenced multiple times (block copied) – so it was performance drag times 3. Remove the join – turn into a “with” subquery (execute it once and reuse it) and, well, things are better.
POST A COMMENT

9 Comments:

Blogger Doug Burns said....

I once saw a beautifully hand-crafted INSTR() but my favourite was a site that had spent hundreds (maybe thousands) of man-days writing what was, in effect, advanced queueing. When I mentioned this, they hadn't heard of it. Easily done, but *so* expensive and *so* frustrating to see.

To say nothing of manual foreign key constraint checking!

Tue Jun 27, 05:55:00 PM EDT  

Blogger Howard J. Rogers said....

I once had a (I think it was) Sybase user burst into tears on one of my courses when I showed him Oracle's deadlock detection at work. He'd just spent 3 weeks writing the same sort of functionality for his Sybase application.

Talk about uncomfrtable! I didn't know where to look, actually!

Tue Jun 27, 06:34:00 PM EDT  

Blogger Q u a d r o said....

Ok, i'm think this one will win the prize. Anyone can guess what is the purpose of this function?

FUNCTION GET_TABLE_DATA_ID( IP_DATA_ID IN TABLE_DATA.DATA_ID%TYPE, IP_DATE IN DATE, OP_INDEX IN OUT BINARY_INTEGER )
RETURN TABLE_DATA.DATA_ID%TYPE
IS
GI BINARY_INTEGER;
GI_LO BINARY_INTEGER := NULL;
GI_HI BINARY_INTEGER := NULL;
BEGIN
IF GT_TABLE_DATA.COUNT = 0 THEN
RETURN IP_CLNT_ID;
END IF;
IF OP_INDEX IS NOT NULL AND GT_TABLE_DATA.EXISTS( OP_INDEX ) THEN
GI := OP_INDEX;
ELSE
GI := GT_TABLE_DATA.FIRST;
END IF;
GI_LO := GT_TABLE_DATA.FIRST;
GI_HI := GT_TABLE_DATA.LAST;
WHILE GI IS NOT NULL
LOOP
IF GT_TABLE_DATA( GI ).CLNT_ID = IP_CLNT_ID THEN
LOOP
IF GT_TABLE_DATA( GI ).START_DATE < IP_DATE AND GT_TABLE_DATA( GI ).END_DATE >= IP_DATE THEN
OP_INDEX := GI;
RETURN GT_TABLE_DATA( GI ).CLNT_ID_1;
ELSIF GT_TABLE_DATA( GI ).START_DATE >= IP_DATE THEN
GI := GT_TABLE_DATA.PRIOR( GI );
ELSE
GI := GT_TABLE_DATA.NEXT( GI );
END IF;
EXIT WHEN ( GI IS NULL ) OR ( GT_TABLE_DATA( GI ).CLNT_ID != IP_CLNT_ID );
END LOOP;
EXIT;
ELSIF GT_TABLE_DATA( GI ).CLNT_ID < IP_CLNT_ID THEN
EXIT WHEN GI = GI_HI;
GI_LO := GI;
GI := GI + ROUND( ( GI_HI - GI ) / 2, 0 );
ELSE
GI_HI := GI;
GI := GI - ROUND( ( GI - GI_LO ) / 2, 0 );
EXIT WHEN GI = GI_LO;
END IF;
END LOOP;
RETURN IP_CLNT_ID;
END GET_TABLE_DATA_ID;

The answer - they loaded table data into PL/SQL associatove array and used this function in *JOINS* instead of base table (for whatever reason...). You can even spot binary search algorithm :)

The over one i saw recently - the developers refused to use ORDER BY. Instead they grab the intire thing to the client and do soap bubbles sort. Performance was terrible at best.

Tue Jun 27, 09:02:00 PM EDT  

Anonymous Anonymous said....

The first really bad "query" I "tuned" was in 1999.

It was a script to calculate weekly inventory summaries of some kind, for the last 6 months. So it scanned the biggest table in the database 26 times (yes, once per week). Solution was to do it with a single scan and a group by. It came down from 20 hours to about 30 minutes. And I did not even know of PX, DFMBRC, sort area sizes in those days (or maybe the version did not have all of these - don't know).

Wed Jun 28, 12:03:00 AM EDT  

Anonymous glenm said....

quote
human genome project with DNA sequences, not a SQL query
/quote

Had to explain to fellow workers why I was laughing so hard. Have seen this kind of thing so often, now you've given it a name - DNA aliasing. I can't stand it when people use a,b,c in real code.

Wed Jun 28, 03:03:00 AM EDT  

Blogger Kim Berg Hansen said....

Hmmmm... convert_date does have a function:


SQL> select to_date('2006 628','YYYYMMDD') from dual;
select to_date('2006 628','YYYYMMDD') from dual
*
ERROR at line 1:
ORA-01843: not a valid month


SQL> select convert_date('2006 628') from dual;

CONVERT_D
---------
28-JUN-06


If month has space rather than zero for month 1-9, convert_date works.

Don't know where the data comes from, but if sufficiently obscure... :-)

Wed Jun 28, 09:12:00 AM EDT  

Blogger Thomas Kyte said....

Ok....

1* select to_date(replace( '2006 628',' ','0'),'YYYYMMDD') from dual
ops$tkyte@ORA9IR2> /

TO_DATE(R
---------
28-JUN-06

Wed Jun 28, 09:24:00 AM EDT  

Blogger Kim Berg Hansen said....

Didn't say convert_date was the best way to solve the problem of spaces ;-)

Wed Jun 28, 09:33:00 AM EDT  

Blogger Thomas Kyte said....

Just realized the replace thing might not work if the data is seriously "botched"

"20061 1 "

so....


1 select :x, convert_date(:x),
2 to_date(substr(:x,1,4)||lpad(rtrim(substr(:x,5,2)),2,'0')||substr(:x,7), 'yyyymmdd') td
3* from dual
ops$tkyte@ORA9IR2> /

:X CONVERT_D TD
-------------------------------- --------- ---------
20061 1 01-JAN-06 01-JAN-06

Wed Jun 28, 11:47:00 AM EDT  

POST A COMMENT

<< Home