Saturday, March 17, 2007

A challenge!

I just got a question on asktom recently. To paraphrase, the question is basically:

We copy some data from another table. After the copy, some of the data is missing. When we check the source table, the data is there. What can be the reason for this????

Well, here is the code - after you read the code, I want you to GUESS as fast as possible what the problem PROBABLY IS. If you are a long time reader, I hope you get it "fast" :) My reply is here, but take the challenge first. Please, ignore the really bad "transaction processing" - yes it is *part* of the problem, but try to guess really fast why this code is most likely BROKEN without any visible errors.

Comment away..........

CREATE OR REPLACE PROCEDURE "SIEBEL_CUST"
IS
CURSOR siebel_cust_cur IS
SELECT *
FROM customerdata_vw
WHERE account_id NOT IN (SELECT cust_account_id
FROM nidapps.CUSTOMERS);

TYPE icust_tabarr IS TABLE OF customerdata_vw%ROWTYPE INDEX BY BINARY_INTEGER;

icustarr icust_tabarr;
insert_count NUMBER := 1;
i NUMBER := 1;
file_id UTL_FILE.FILE_TYPE;
BEGIN
OPEN siebel_cust_cur;
file_id := utl_file.FOPEN( 'h:\nidbatchprog', 'ston-c.log', 'w' );
LOOP
EXIT WHEN siebel_cust_cur%NOTFOUND ;
FETCH siebel_cust_cur INTO icustarr(i);
utl_file.PUT_LINE( file_id,i);
INSERT INTO CUSTOMERS (customer_id,cust_account_id,customer_name)
VALUES (customers_seq.NEXTVAL,icustarr(i).Account_ID,icustarr(i).Account_name);
i := i + 1;
COMMIT;
utl_file.PUT_LINE( file_id,icustarr(i).Account_name);
insert_count :=insert_count + 1;
IF insert_count = 1000 THEN
COMMIT;
insert_count := 1;
ELSE
insert_count := insert_count + 1;
END IF;
END LOOP;
dbms_output.put_line(i);
CLOSE siebel_cust_cur;
utl_file.PUT_LINE( file_id,'Customer migration completed');
dbms_output.put_line('for cursor');
utl_file.fclose(file_id);
EXCEPTION
WHEN OTHERS THEN NULL;
END;
POST A COMMENT

31 Comments:

Anonymous Anonymous said....

Wow.... How many things can you do wrong in one PL/SQL block. Let me count the ways...

Sat Mar 17, 09:14:00 PM EDT  

Anonymous jp said....

Well, apart from the dreaded "When others then null" potentially dropping bad records on the floor (you know, silently, so they don't bother anybody), there's the little problem of the last less-than-1000 inserts not being commited...

Sat Mar 17, 11:48:00 PM EDT  

Blogger Thomas Kyte said....

there's the little problem of the last less-than-1000 inserts not being commited...

Not really..

that commit every 1,000 records is just a little "positive reinforcement" that the prior 1,000 COMMITS "take"

...
i := i + 1;
COMMIT;

utl_file.PUT_LINE( file_id,icustarr(i).Account_name);
insert_count :=insert_count + 1;
IF insert_count = 1000 THEN
COMMIT;
insert_count := 1;
ELSE


they are committing every insert already :)

Sun Mar 18, 12:04:00 AM EDT  

Blogger Vivek Gandhi said....

Hi Tom

Quickly looking at the code, I see that insert_count is being incremented twice.

insert_count :=insert_count + 1;
IF insert_count = 1000 THEN
COMMIT; insert_count := 1; ELSE
insert_count := insert_count + 1;

Sun Mar 18, 02:11:00 AM EDT  

Blogger HP Fuchs said....

:-))
I think somebody wanted to make fun of Tom Kyte

Sun Mar 18, 02:47:00 AM EDT  

Anonymous Anonymous said....

Just read your follow up to the code on asktom. Talk about LOL :D

Sun Mar 18, 03:43:00 AM EDT  

Anonymous A O'Hara said....

I think the key is "some data is missing" - so since "i" is always incremented, the table grows without limit, until no more memory is available, an exception is thrown and ignored by "when others then null", and so the remaining data is silently not inserted.

I've also noticed that "exit when cursor%notfound" is before the fetch, shouldn't it be after ?

Sun Mar 18, 05:38:00 AM EDT  

Anonymous Anonymous said....

The dreaded "WOTN" strikes againĀ²

Sun Mar 18, 07:27:00 AM EDT  

Blogger Thomas Kyte said....

"exit when cursor%notfound is before the fetch"


OPEN siebel_cust_cur;
LOOP
EXIT WHEN siebel_cust_cur%NOTFOUND ;
FETCH siebel_cust_cur INTO icustarr(i);


that actually would generate more data - but could be one of the underlying causes...

It'll process the last record twice - they probably had a unique index on there someplace and noticed the last batch of 1,000 or less records (or 500 - for there is that double increment of I) went missing quite predictably.

That is likely why we see the addition of the commit of each record - in addition to the legacy "let's commit each batch" code.

Because they insert the last record twice....


ops$tkyte%ORA10GR2> declare
2 cursor c is select * from dual;2007
3 type array is table of dual%rowtype index by binary_integer;
4 l_rec array;4 7 10
5 begin 2 5 8 11
6 open c; 6 9 12
7 loop
8 exit when c%notfound;
9 fetch c into l_rec(1);
10 dbms_output.put_line( l_rec(1).dummy || ' processed...' );
11 end loop;
12 close c;
13 end;
14 /
X processed...
X processed...

PL/SQL procedure successfully completed.


Now, if there were no records, it would work by "accident" in their case since they are using a plsql table - the fetch would fail to fill in the table variable, and the access of it would raise no_data_found, which of course would be silently ignored by the when others then null

Sun Mar 18, 07:29:00 AM EDT  

Anonymous A O'Hara said....

So now that they commit at every insert, it's either a resource problem (in-memory table too big) or, even more likely, at least one dirty row in the source table - that causes an error upon loading (a "cannot insert null" would suffice) and terminates silently the load because of "when others then null".

Sun Mar 18, 08:09:00 AM EDT  

Anonymous SwitchBL8 said....

At least they've put the Exception at the highest PL/SQL level, so it'll stop the procedure at the first error.

Apart from that this routine reminds me of the SQL and PL/SQL routines I've seen that were created by SAP consultants. Must be something in the area of work. They don't know sh*t about the database and how to make use of it. SAP and Siebel consultants should NOT be left alone with an SQL-prompt. Guard them, and tell them upfront you're licensed to kill.

Sun Mar 18, 09:17:00 AM EDT  

Anonymous Rish G. said....

Is it because of the classic use of NOT IN instead of NOT EXISTS? Assuming that the source table has null account ids, the cursor should have been created using the NOT EXISTS clause.

Sun Mar 18, 04:45:00 PM EDT  

Blogger Gary Myers said....

After the loop, they've got both a DBMS_OUTPUT and a UTL_FILE to say the job has completed. How can they NOT realise that, if they don't have those lines, it must have jumped to the exception handler.
Sign of the times, no-one has mentioned ora-1555 as the potential hidden culprit. undo_retention is proabbly hiding a lot of sins.

Sun Mar 18, 07:06:00 PM EDT  

Anonymous Anonymous said....

Took me longer than it really should. I'll start reading PL/SQL code in reverse order from now on.... :)

Sun Mar 18, 07:28:00 PM EDT  

Blogger Mark said....

I thought the same thing as Rish G. If they have null ids in the source table, then NOT IN could cause some records to be missed, even when the WOTN is removed.

Sun Mar 18, 10:31:00 PM EDT  

Blogger Chi Hoang said....

I had an idea while I was reading your challenge; I've been following you long enough to know your number one pet peeve :) It must have driven you nuts and ruined your day :)

Mon Mar 19, 02:51:00 AM EDT  

Anonymous Anonymous said....

probably santosh
only wanted to
program in a "safe" way
?

Mon Mar 19, 05:01:00 AM EDT  

Blogger Paul Moore said....

Sadly, I was so nauseated by the commit logic, that I never got as far as the "when others then none".

I'm not sure whether that means I failed the challenge, or I win for stopping before permanent brain damage set in :-)

Mon Mar 19, 12:20:00 PM EDT  

Anonymous MDinh said....

That is funny!

Being a long time Tom fan, it took me no longer than 1 second to scroll down to the bottom of the code to find Tom's #1 complaint.

Mon Mar 19, 01:04:00 PM EDT  

Anonymous Martin said....

The last sentence you typed made me 'know' the answer you were looking for without even looking at the code, that hint was just to clear for me. I'm starting to know what you loath after all these years you've been around
;-)

My guess is the guy is missing all the even numbered records. That should have given him a pretty good hint what to look at...

Mon Mar 19, 01:07:00 PM EDT  

Blogger Jonathan said....

This code leaves the cursor open across commits. However, given the other errors in the code that others have already pointed out this is likely the least of their worries.

Mon Mar 19, 04:26:00 PM EDT  

Blogger Jared said....

The problem is likely NULLs in cust_account_id.

Now I will go look at your answer. :)

Mon Mar 19, 08:12:00 PM EDT  

Blogger Jared said....

Ha!

I should have read all the code.

WHEN others then
null

Yup, that's uually a bad idea.

The only time I can find that I have used that to good purpose was in a logon trigger to set 10046 events on.

If there was a problem setting 10046 in the trigger, I did not want the error raised.

Mon Mar 19, 08:16:00 PM EDT  

Anonymous Robert said....

I spotted it really fast
I'm getting smarter just by reading this blog haha

Tue Mar 20, 10:17:00 AM EDT  

Blogger DarkAngelStrike said....

You must submit this on WorseThanFailure. If you don't, i will ! =)

http://worsethanfailure.com/

Tue Mar 20, 04:34:00 PM EDT  

Blogger hajamanharate said....

Hi tom i am the one who gave that query. i am banging my head from past few days with management to change that insert count logic

and let me clarify one more thing,

the records which are missing does not contain null values. and as far as i know the unique key constraint violation is also not there

Wed Mar 21, 03:13:00 AM EDT  

Blogger Thomas Kyte said....

hajamanharate said...

if you do nothing else - just remove

exception
when others then null;


and all will be revealed to you. You are hitting an error. The error is being hidden - made to disappear.


Let the error be shown to you....

Wed Mar 21, 06:36:00 AM EDT  

Anonymous Anonymous said....

dbms_output.put_line

... wood silent exploding - array too small - then

EXCEPTION
WHEN OTHERS THEN NULL;

takes the Rest

Danimilkasahne

Wed Mar 21, 11:40:00 AM EDT  

Anonymous Anonymous said....

A programmer at my company found a more creative way to code
EXCEPTION
WHEN OTHERS THEN NULL;

Here it is:

BEGIN
BEGIN
A select statement here
EXCEPTION
WHEN OTHERS THEN
GOTO getout;
END;
BEGIN
IF a condition
THEN
an update statment
END IF;
EXCEPTION
WHEN OTHERS THEN
GOTO getout;
END;
<<getout>>
NULL;
END;

Fri Mar 23, 11:51:00 AM EDT  

Blogger Thomas Kyte said....

A programmer at my company found a more creative way to code

Ok, you WIN.

that rocks... extra points for creativity :)

Fri Mar 23, 12:11:00 PM EDT  

Anonymous Anonymous said....

Very Good article , this article make some interesting points.
Tactical Flashlights
r c helicopter
video game
Tactical Flashlight

Mon Jun 23, 02:45:00 AM EDT  

POST A COMMENT

<< Home