Wednesday, July 06, 2005

How cool is this part II

Taking a deeper look at the LOG ERRORS option for DML in 10g Release 2.

Ok, so I’ve done the insert component of the testing and found something rather dramatic.  We’ll let this one unravel as we go through it.  I set up a test to compare

  • normal insert that could fail and just undo everything
  • insert with LOG ERRORS
  • procedural code with exceptions

and how each would go.  I did it with three levels of constraints:

  • no constraints (beyond NOT NULL)
  • a check constraint
  • a check plus primary key constraint

and in two modes:

  • direct path for the inserts (conventional of course for the row by row)
  • conventional path all around.

In short – direct path, LOG ERRORS rocks.  Conventional path, we might have an issue, something to get incrementally tuned in the implementation.  But remember, this is still pre-production, we’ll have to come back yet again later.  The test used a table structured like this:

create table t2
as
select *
  from big_table
 where 1=0;

alter table t2
add constraint t2_check
check (object_id>0);

alter table t2
add constraint t2_pk
primary key(id);
                                                                               
begin
 dbms_errlog.create_error_log
 ( 'T2' );
end;
/

and there were corresponding T1 and T3 tables for the regular insert and the procedural code (no error log tables were created for them).  I would comment out the constraints or not depending on the test being performed.  The code used looked like this:

insert /*+ APPEND */ into t1
select * from big_table;

insert /*+ APPEND */ into t2
select * from big_table
LOG ERRORS
REJECT LIMIT UNLIMITED;

DECLARE
  type array is table of t3%rowtype
       index by binary_integer;
  data       array;
  errors     NUMBER;
  dml_errors EXCEPTION;
  l_cnt      number := 0;
  PRAGMA exception_init( dml_errors, -24381 );
  cursor c is select * from big_table;
BEGIN
  open c;
  loop
    fetch c BULK COLLECT INTO data LIMIT 100;
    begin
      FORALL i IN 1 .. data.count SAVE EXCEPTIONS
        insert into t3 values data(i);
    EXCEPTION
      WHEN dml_errors THEN
        errors := SQL%BULK_EXCEPTIONS.COUNT;
        l_cnt  := l_cnt + errors;
        FOR i IN 1 .. errors LOOP
          dbms_output.put_line
          ( 'Error iteration ' ||
            SQL%BULK_EXCEPTIONS(i).ERROR_INDEX ||
            ' Oracle error is ' ||
            SQL%BULK_EXCEPTIONS(i).ERROR_CODE );
        end loop;
    end;
    exit when c%notfound;
  END LOOP;
  close c;
  dbms_output.put_line( l_cnt || ' total errors' );
end;
/
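
When rows do get rejected, they land in ERR$_T2 together with the Oracle error for each one (the ORA_ERR_% columns are created automatically by dbms_errlog).  A quick way to eyeball them after a run:

select ora_err_number$, ora_err_mesg$, object_id
  from err$_t2
 where rownum <= 5;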

Now, using a modified runstats utility that allows for 4 different test cases (a bare-bones sketch of the timing part follows the results), and running the above with 1,000,000 rows (BIG_TABLE is a copy of ALL_OBJECTS with a primary key field ID added), I observed:

  • direct path insert, no error logging, 9 seconds elapsed, 4.66 cpu seconds
  • direct path insert, error logging, 8.78 seconds elapsed, 4.78 cpu seconds
  • procedural code, 19.36 seconds elapsed, 18 cpu seconds
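
(The modified runstats itself isn't reproduced here.  Stripped of the latch and statistic snapshots it takes, the timing part of such a harness boils down to roughly this sketch:)

declare
   l_ela number;
   l_cpu number;
begin
   l_ela := dbms_utility.get_time;       -- hundredths of a second
   l_cpu := dbms_utility.get_cpu_time;   -- available in 10g and up

   insert /*+ APPEND */ into t2
   select * from big_table
   LOG ERRORS REJECT LIMIT UNLIMITED;

   dbms_output.put_line
   ( 'elapsed: ' || (dbms_utility.get_time - l_ela)/100 || ' sec, ' ||
     'cpu: '     || (dbms_utility.get_cpu_time - l_cpu)/100 || ' sec' );

   commit;
end;
/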

That was with all constraints enabled.  Results were similar with just the check constraint (no primary key) and with no constraints at all.  Then I turned the append hint into xappend – since an unrecognized hint is simply ignored, that gives conventional path inserts.  What a difference an X makes:

  • insert, no error logging, 10.1 seconds elapsed, 7.36 cpu seconds
  • insert, error logging, 47.6 seconds elapsed, 42.55 cpu seconds
  • procedural code, 21 seconds elapsed, 17.99 cpu seconds

Hmm.  The big difference is the amount of undo generated – 163MB for the insert with error logging versus 28/29MB for the plain insert and the procedural code.  Additionally, the insert with error logging did about 16 million latch gets, while the others were in the 1 to 1.5 million range.  Most of that difference was on the cache buffers chains latch – the error logging insert hit the buffer cache much more frequently.
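
(Those undo and latch numbers come from the session and latch statistics.  To sanity check a run yourself, something along these lines works – note V$LATCH is instance wide, so you would snapshot it before and after the load, which is exactly what runstats does:)

select sn.name, ms.value
  from v$mystat ms, v$statname sn
 where ms.statistic# = sn.statistic#
   and sn.name in ( 'undo change vector size', 'redo size' );

select name, gets
  from v$latch
 where name = 'cache buffers chains';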

So, so far, this looks really promising for direct path loads, but we’ll have to look into the conventional path further.


27 Comments:

Anonymous John Spencer said....

Tom:

This is very cool. The fact that there seems to be no significant performance hit for direct path loads with error logging is excellent news, although I might be prepared to pay a reasonable penalty for the functionality.

One question: if I do the INSERT /*+ APPEND */ LOG ERRORS thing, then SELECT from the error table and find a large number of errors, can I rollback the INSERT and keep the records in the error table?

I only recently downloaded R1; I guess I am going to have to warm up my broadband again.

Thanks
John

Wed Jul 06, 11:56:00 AM EDT  

Blogger Thomas Kyte said....

can I rollback the INSERT and keep the records in the error table?

Good question, appears they are inserted in the equivalent of an autonomous transaction:

ops$tkyte-ORA10GR2> create table t1
2 as
3 select *
4 from scott.emp
5 where 1=0;

Table created.

ops$tkyte-ORA10GR2> alter table t1 add constraint t1_check check (sal>900);

Table altered.

ops$tkyte-ORA10GR2>
ops$tkyte-ORA10GR2> begin
2 dbms_errlog.create_error_log
3 ( 'T1' );
4 end;
5 /

PL/SQL procedure successfully completed.

ops$tkyte-ORA10GR2> insert into t1
2 select * from scott.emp
3 LOG ERRORS
4 REJECT LIMIT UNLIMITED;

13 rows created.

ops$tkyte-ORA10GR2>
ops$tkyte-ORA10GR2> select count(*) from t1;

COUNT(*)
----------
13

ops$tkyte-ORA10GR2> select count(*) from err$_t1;

COUNT(*)
----------
1

ops$tkyte-ORA10GR2> rollback;

Rollback complete.

ops$tkyte-ORA10GR2> select count(*) from t1;

COUNT(*)
----------
0

ops$tkyte-ORA10GR2> select count(*) from err$_t1;

COUNT(*)
----------
1

Wed Jul 06, 12:10:00 PM EDT  

Blogger Tim... said....

Tom:

Cool, but it does mean some consideration before launching full-on into a change over.

By the way, the first sentence reads, "Taking a deeper log at the LOG ERRORS option for DML in 10g Release 2." Did you mean "look", not "log"?

Thanks for the info!

Wed Jul 06, 01:53:00 PM EDT  

Blogger Thomas Kyte said....

Cool, but it does mean some consideration before launching full-on into a change over

That is something I think can be said about everything!

Thanks for the "log" point out there -- fruedian slip :) Was thinking about LOGGING ERRORS

Wed Jul 06, 01:58:00 PM EDT  

Anonymous John Spencer said....

Tom:

Thanks for checking. The autonomous transaction on the errors table makes it way beyond cool.

I can almost see an automated ETL tool. Stuff the data in using LOGGING ERRORS, check the error table, (possibly) rollback if too many errors, then use the information in the error table to automatically fix the source data.
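
Something along these lines, say (just a sketch – STAGE_TABLE and the threshold are made up):

declare
   l_errors number;
begin
   insert into t2
   select * from stage_table        -- hypothetical source
   LOG ERRORS REJECT LIMIT UNLIMITED;

   select count(*) into l_errors from err$_t2;

   if l_errors > 1000               -- made-up threshold
   then
      rollback;                     -- the ERR$_T2 rows survive the rollback
   else
      commit;
   end if;
end;
/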

I assume this works with MERGE as well?

John

Wed Jul 06, 02:06:00 PM EDT  

Blogger Thomas Kyte said....

Insert, Update, Delete, Merge -- they are all there for this:

ops$tkyte-ORA10GR2> create table
2 emp as
3 select empno, ename, sal from scott.emp where 1=0;

Table created.

ops$tkyte-ORA10GR2> begin
2 dbms_errlog.create_error_log
3 ( 'EMP' );
4 end;
5 /

PL/SQL procedure successfully completed.

ops$tkyte-ORA10GR2> alter table emp add constraint check_sal check (sal > 900);

Table altered.

ops$tkyte-ORA10GR2>
ops$tkyte-ORA10GR2> merge into emp a
2 using scott.emp b
3 on (a.empno = b.empno)
4 when matched then update set ename = b.ename, sal = b.sal
5 when not matched then insert (empno,ename,sal)
6 values (b.empno,b.ename,b.sal)
7 LOG ERRORS
8 REJECT LIMIT UNLIMITED;

13 rows merged.

Wed Jul 06, 02:21:00 PM EDT  

Blogger Alberto Dell'Era said....

Could it be that the insert "Xappend" got turned (internally) into 1,000,000 single-row inserts ?

I mean, if you do the math, the redo overhead per row has been:
(163M - 28M) / 1M = 135 bytes
which is about the size of the "begin/end redo" record pair (around 200 bytes, at least in 9i, but that would be similar in 10g).

That would explain the increase in "cache buffers chains" latch gets as well.

Wed Jul 06, 02:27:00 PM EDT  

Anonymous Scot said....

Looks like a very nice feature, thanks for the demo.

Does it work for system-type errors, such as running out of undo space, or even running out of space in the target data table (provided of course the error table was in a different tablespace that did have space left)? Is it good for ANY error?

And, do records inserted into the error table generate redo and/or undo?

Wed Jul 06, 02:34:00 PM EDT  

Anonymous Anonymous said....

Tom,
This new feature (dbms_errlog.create_error_log) looks really cool. I am wondering if this could potentially be a performance issue, especially when you insert a million records in one shot.

I am also thinking about calling this procedure from SQL*Loader. Will that logging still work?

--Srikanth

Wed Jul 06, 03:00:00 PM EDT  

Blogger Thomas Kyte said....

Could it be that the insert "Xappend" got turned (internally) into 1,000,000 single-row inserts ?

comparing to a single row program, the insert with log errors is much better. so say you insert 1,000,000 rows one at a time,

236.67 elapsed seconds
225.01 cpu seconds

and hey, there you go -- same big amount of undo, latches like crazy (26 million)

Yes, it looks like log errors is internally a much improved row by row process in some respects -- many optimizations, but also many of the implications of row by row processing (so it is a faster row by row processor)

Does it work for system type errors, such as running out of undo space

That is already covered by resumable statements! That was done in 9i. The same facility exists.
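
For reference, that is the ALTER SESSION ENABLE RESUMABLE facility; a minimal example of pairing it with the load:

alter session enable resumable timeout 3600 name 'big t2 load';

insert /*+ APPEND */ into t2
select * from big_table
LOG ERRORS REJECT LIMIT UNLIMITED;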

Wed Jul 06, 03:14:00 PM EDT  

Blogger Thomas Kyte said....


--Srikanth said...


The performance implications are in the blog -- direct path loading, away you go; conventional path loading, not as good.

sqlldr would log to the BAD file as before (e.g. you need to add "log errors reject..." to the SQL yourself, it doesn't "just happen")

Wed Jul 06, 03:18:00 PM EDT  

Blogger Alberto Dell'Era said....

comparing to a single row program, the insert with log errors is much better. so say you insert 1,000,000 rows one at time,
...
latches like crazy (26million)


Is big_table perhaps ordered on disk by "id", i.e. by the PK of t1, t2, t3?
If so, I wonder what the results would be if it got randomly "disordered" (re-created using a CTAS with "order by dbms_random")...

Wed Jul 06, 04:07:00 PM EDT  

Anonymous Gabe said....

John Spencer said:

I can almost see an automated ETL tool. Stuff the data in using LOGGING ERRORS, check the error table, (possibly) rollback if too many errors, then use the information in the error table to automatically fix the source data.


Hi John:

I doubt the automatic fix part. Conceptually this is quite doable today ... all they did with this new feature is automate, simplify, and relax some processing constraints [but in a tightly controlled way] ... like not failing the statement because of just one row, maybe performance improvements. Responding to the individual errors still requires human intervention and/or complex coding for error resolution. No magic there.

Wed Jul 06, 05:56:00 PM EDT  

Blogger Salem G. said....

Very nice indeed,
but, when used in direct mode, it does not take into account errors like missing FKs (just as sqlldr does not in direct mode).

Rgds

Wed Jul 06, 06:17:00 PM EDT  

Blogger Thomas Kyte said....

it does not take into account errors like missing FKs,

catch 22, you cannot be in direct path insert mode AND have FKs enabled at the same time.

append is a hint and it'll be ignored utterly with enabled foreign keys.

So, the fkeys will be validated, but you'll be using a conventional path insert.
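
A quick way to see that for yourself (just a sketch): after a true direct path insert you cannot query the table in the same transaction (ORA-12838), whereas with an enabled foreign key the APPEND hint is silently ignored and the query works fine.

create table p ( x int primary key );
create table c ( x int references p );

insert into p values ( 1 );
commit;

insert /*+ APPEND */ into c select x from p;
select count(*) from c;   -- works: the enabled FK forced a conventional insert
                          -- drop the FK, rerun the insert, and this select raises ORA-12838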

Wed Jul 06, 06:20:00 PM EDT  

Blogger Colin 't Hart said....

R2 (for Linux) is now available. See otn.oracle.com

Thu Jul 07, 05:21:00 AM EDT  

Blogger Jurij said....

Thomas Kyte said:
Thanks for the "log" point out there -- fruedian slip :)


Or even freudian one! ;)

Thu Jul 07, 07:32:00 AM EDT  

Anonymous John Spencer said....

Gabe said:

I doubt that automatic fix part. Conceptually this is quite doable today


Gabe:

I did say I could almost see automated ETL :-)

I agree that it is conceptually possible to do it today, but you have to roll your own for most of it. In most cases, human intervention would probably be necessary, but there are cases where fixes could be automated. For example, if you assume the source data is correct, and you are missing a FK value, you could just add it based on the values in the errors table.

Tom:

Sorry to hijack your blog.

John

Thu Jul 07, 10:49:00 AM EDT  

Anonymous Anonymous said....

Is it just me or has anyone else noticed that the "About Me" section of Tom's blog now starts after the last piece of comment (last comment on the comments page or last entry on the entries page)?

Thu Jul 07, 11:05:00 AM EDT  

Blogger Thomas Kyte said....

Is it just me or has anyone else noticed that the "About Me"

appears to be IE, the code samples are "pushing it away", doesn't happen on real browsers like firefox.

I tried scrunching the code in - but it didn't make IE happy, it'll pass in a day or two unless I keep adding code :)

Thu Jul 07, 11:20:00 AM EDT  

Anonymous Gabe said....

John Spencer said …

John,

I apologize if I misunderstood your comment.

I guess my drift was: yes, the feature is nice and opens some new (programmatic) avenues … but as far as dealing with the hard data-cleanliness problems, no, I don't see any substantial benefit. We have the errors in the ERR$_% table … now what? Anything different after this point?

In fact, I don't even see this proliferation of ERR$_% tables as an advantage … in ETL, we don't really/generally purge rows from the source tables if they fail … so a ROWID/PK to point to the data would suffice … in which case, whatever columns are left from these ERR$_% tables could be consolidated into just one error table … which is what we (certainly I) did before, didn't we? If I need a report/extract to show the errors with the data then I only need to join.

Anyway, I already said I should wait and see/test it for myself rather than assume … better live up to that promise. Just downloaded 10gR2 overnight … new features, new gizmos … ever wonder "isn't it too much"? … I see 1% of the features being used in 99% of the cases.

Tom … sorry for the trespassing.

Thu Jul 07, 04:13:00 PM EDT  

Anonymous John Spencer said....

Tom:

Last trespass.

Gabe:

No apologies required, as always, I value your comments. I agree that this probably won't help with the "hard" cleanliness problems, but some of the simpler ones maybe. Also, I would be willing to bet that what Oracle does in the core product is faster than anything I could write to accomplish the same task, and certainly leaves me with much less code to write and maintain.

Unfortunately, I'm going to have to wait awhile to get R2, my Linux PC at home died yesterday :-(.

By the way, you must have very sophisticated users or vendors if you see as much as 1% of the features being used at any time.

John

Thu Jul 07, 04:48:00 PM EDT  

Anonymous Duke Ganote said....

DML Error logging can be configured: either for stopping/rolling back after any defect, or plowing thru all defects:
http://tinyurl.com/2y56ro
I'm a zero defect (ZD) guy myself.
http://en.wikipedia.org/wiki/Zero_Defects
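
That choice is just the REJECT LIMIT clause on the statement -- a sketch against the T2 table from the post:

-- fail the whole statement as soon as an error shows up
-- (0 is the default limit):
insert into t2 select * from big_table
  log errors reject limit 0;

-- plow through everything, logging the failures into ERR$_T2 as it goes:
insert into t2 select * from big_table
  log errors reject limit unlimited;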

Mon Oct 01, 10:06:00 PM EDT  

Anonymous Anonymous said....

Can this feature be used in a multi-table insert?

Mon Oct 27, 09:50:00 AM EDT  

Blogger Thomas Kyte said....

@Anonymous

documentation says....

yes, you can

Mon Oct 27, 10:02:00 AM EDT  

Anonymous Anonymous said....

I tried it with a multi-table insert. Please let me know what I am doing wrong here.


SQL> create table t1 as Select * from all_objects where 1 = 2;

Table created.

SQL> create table t2 as Select * from all_objects where 1 = 2;

Table created.

SQL> exec dbms_errlog.create_error_log('t1');

PL/SQL procedure successfully completed.

SQL> exec dbms_errlog.create_error_log('t2');

PL/SQL procedure successfully completed.

SQL> INSERT
2 /*+ APPEND PARALLEL(t1 , 5) PARALLEL(t2 , 5) */
3 ALL
4 WHEN ( OBJECT_TYPE ) <> 'TABLE' THEN
5 INTO t1
6 WHEN ( OBJECT_TYPE ) = 'TABLE' THEN
7 INTO t2
8 Select * from all_objects ;

53022 rows created.

SQL> rollback;

Rollback complete.

SQL> INSERT
2 /*+ APPEND PARALLEL(t1 , 5) PARALLEL(t2 , 5) */
3 ALL
4 WHEN ( OBJECT_TYPE ) <> 'TABLE' THEN
5 INTO t1
6 WHEN ( OBJECT_TYPE ) = 'TABLE' THEN
7 INTO t2
8 Select * from all_objects
9 log errors into err$_t1 reject limit unlimited
10 log errors into err$_t2 reject limit unlimited;
log errors into err$_t1 reject limit unlimited
*
ERROR at line 9:
ORA-00933: SQL command not properly ended



SQL> INSERT
2 /*+ APPEND PARALLEL(t1 , 5) PARALLEL(t2 , 5) */
3 ALL
4 WHEN ( OBJECT_TYPE ) <> 'TABLE' THEN
5 INTO t1
6 WHEN ( OBJECT_TYPE ) = 'TABLE' THEN
7 INTO t2
8 Select * from all_objects
9 log errors into err$_t1 , err$_t2 reject limit unlimited;
log errors into err$_t1 , err$_t2 reject limit unlimited
*
ERROR at line 9:
ORA-00933: SQL command not properly ended

Mon Oct 27, 10:18:00 AM EDT  

Blogger Thomas Kyte said....

that was why I pointed you to the wire diagram -- the syntax is there; it shows the error logging clause following the values clause for each of the tables.

Like this

ops$tkyte%ORA10GR2> create table t1 ( x int, y int );

Table created.

ops$tkyte%ORA10GR2> create table t2 ( x int check (x>0) );

Table created.

ops$tkyte%ORA10GR2> create table t3 ( y int check (y>0) );

Table created.

ops$tkyte%ORA10GR2> exec dbms_errlog.create_error_log( 'T2' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec dbms_errlog.create_error_log( 'T3' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t1 values ( 1, 1 );

1 row created.

ops$tkyte%ORA10GR2> insert into t1 values ( -1, 1 );

1 row created.

ops$tkyte%ORA10GR2> insert into t1 values ( 1, -1 );

1 row created.

ops$tkyte%ORA10GR2> insert all
2 when (y>0) then into t2(x) values (x) log errors into err$_t2 reject limit unlimited
3 when (x>0) then into t3(y) values (y) log errors into err$_t3 reject limit unlimited
4 select * from t1;

2 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> column xc format a10
ops$tkyte%ORA10GR2> column yc format a10
ops$tkyte%ORA10GR2> select cast( 't1' as varchar2(15) ) tag, to_char(x) xc, to_char(y) yc from t1 union all
2 select 't2', to_char(x), null from t2 union all
3 select 't3', null, to_char(y) from t3 union all
4 select 'err$_t2', x, null from err$_t2 union all
5 select 'err$_t3', null, y from err$_t3
6 order by 1
7 /

TAG             XC         YC
--------------- ---------- ----------
err$_t2         -1
err$_t3                    -1
t1              1          1
t1              -1         1
t1              1          -1
t2              1
t3              1

7 rows selected.

Mon Oct 27, 10:30:00 AM EDT  
