Understanding...
I was reading around the web and read an article that was talking about the EXCEPTIONS INTO clause – a feature whereby when you enable a constraint, Oracle will record the rows that do not pass the constraint check into an exceptions table. It works with unique, primary key, check and so on. The article seemed to be indicating “exceptions into was broken” – but it seemed more likely to me, that this was a case of not understanding how it actually works.
So, I contacted the author and we had a back and forth email exchange. Basically, they came back with “the documentation isn’t good enough, this could really hose someone up”. Well, I’d like to do a poll. I’m curious. What I’d like you to do is read the documentation, the SQL Reference guide had this to say:
exceptions_clause
Specify a table into which Oracle places the rowids of all rows violating the constraint. If you omit schema, then Oracle assumes the exceptions table is in your own schema. If you omit this clause altogether, then Oracle assumes that the table is named EXCEPTIONS. The exceptions table must be on your local database.
----------------------------------------------
And the Admin Guide had more. Now, after reading that (don’t run the example! Just answer the poll based on the reading or your past knowledge of using this command), can you fill in the blanks of this example? We insert two rows into T with the same value – when we enable a primary key with exceptions into, how many rows will be “exceptional”?
ops$tkyte@ORA10GR1> create table t ( x int );
Table created.
2 owner varchar2(30), table_name varchar2(30),
3 constraint varchar2(30));
Table created.
ops$tkyte@ORA10GR1> insert into t values ( 1 );
1 row created.
ops$tkyte@ORA10GR1> insert into t values ( 1 );
1 row created.
ops$tkyte@ORA10GR1> alter table t add constraint t_pk primary key(x)
2 EXCEPTIONS INTO exceptions;
alter table t add constraint t_pk primary key(x)
ERROR at line 1:
ORA-02437: cannot validate (OPS$TKYTE.T_PK) - primary key violated
ops$tkyte@ORA10GR1> select count(*) from exceptions;
COUNT(*)
----------
?????
Warning: The following poll will display to you advertising which is totally unrelated to this site:
Quote of the Day….
Not that anyone should care by now (we are in 10g times after all) ... but that "Hierarchy" package presented last year as a 8i method for doing what sys_connect_by_path does is a bug waiting to happen. One needs to understand how it works in order to use it safely.
I thought – how perfect, that sums up a lot of things. I use this slide in a build – the last line morphs into:
One needs to understand how Oracle works in order to use it safely.
But I thought – hey, that makes Oracle sound “special”, so the slide morphs into:
One needs to understand how ANYTHING works in order to use it safely.
There we go. Unless you understand a command, have read the documentation, have tested it, have tried it – you probably don’t get it. Understanding comes from
- Reading
- Testing
- Implementing
To assume something is broken because it didn’t work the way you thought it should is wrong. And, given this person had at least as many years experience with Oracle as I did, goes to show – if you want to provide advice, back it up, show what you say is true. We now have another piece of material on this thing called the internet with a myth on it. Because someone had never used a command before, didn’t read the documentation on it, and assumed how it should work – given the problem he had to solve. Recipe for disaster.


45 Comments:
That is why I only use ...
1. asktom.oracle.com
2. tom's books
3. oracle documentation
4. metalink
5. books from other reputable authors (JLewis, CMillsap, etc)
as sole references.
I have stumbled across too many 'believe me, I'm an expert' books/articles earlier in my career - now, I just ignore those (unless I see some 'proof'.
Talking about Oracle document, sometime it is not that easy to get the right information people need.
Today, I just find out that
To_date('1999', 'YYYY') gives
you a date with year 1999 and current Month and date. It was indeed a surprise to me because I was expecting 19990101.
When I went to Oracle Docs, I could not find out any information on it.
Any ideas?
Talking about Oracle document
You are correct, it does not seem to explicity state what the default values for the month, day, year, hour, minute, second will be if not specified.
examples seem to show that:
year default to current year if not supplied.
month defaults to current month if not supplied
day defaults to 1 (month end would prevent it defaulting to current day I support) if not supplied.
hour/minute/second default to 0 if not specified.
I filed this bug this morning:
...
the SQL reference manual does not document the expected return from a query
such as:
select to_date('2000','yyyy') from dual;
since the default values to be used for the day, year, month, hour, minute
and second are not documented.
chapter two of that manual should specify that the default values unless
specified in the input string are:
year = current year (from sysdate)
day = 01 (first day of month)
month= current month ( from sysdate)
hour/minute/second = 0
....
Tom,
An unfair poll. You make "of all rows violating the constraint" bold, thus influencing your audience. You should have made it the same font as the rest of the text -- just like the documentation
An unfair poll.
I agree, Ok, unbolded.
the votes were:
5 for 1 / 26 for 2 / 2 for > 2
when I made the change.
Not only is this the way that the clause works, but it's the only way that it ought to work ...
http://oraclesponge.blogspot.com/2005/04/exceptions-into.html
Actually I think I agree with Mike. Yes, as David mentioned, there is no concept of a "ordinal" row, but it wouldnt have hurt to have the EXCEPTIONS INTO clause treat the first row it encounters as the "first" and put the others into the exceptions table. This would make "deduping" a table trivial using this feature.
delete from mytab where rowid in (select row_id from exceptions)
Instead of doing the complicated SQL that is currently needed to de-dupe a table!
Thanks
About "de-duping": the first row encountered is not necessarily deterministic, so I don't think you can go that route.
I answered "2" only because I deduced that Oracle wouldn't go through the trouble of an ORDER BY ROWID to eliminate dupes. I figured Oracle would want to keep it deterministic, and that's too much work.
Also, I'd imagine Oracle wants to give you a choice of which row to keep. (The two rows could be in different partitions, based on another column, for example.)
That said, the documentation needs to clear this up. It's not obvious at all.
anonymous wrote:
first row it encounters as the "first" and put the others into the exceptions table. This would make "deduping" a table trivial using this feature.
The table would end up with random data. Imagine for a second that your account number, which happens to be the primary key, gets duplicated in a bank's system. The only difference in the rows is in non-key attributes (account owner, etc.) Would you care very much about which row got deleted, and would you accept the banks explanation if it said that your row just happened to be the second one it encountered?
You might be able to determine that the cause of the error is duplicate rows, but just because two rows have the same ID number does not mean that they are dupes. Oracle is doing the right thing -- showing you the rows that you need to look at to figure out why the constraint won't work. You need to figure out what to do about it -- the answer will not always be 'delete where rownum = 1' (or my perennial favorite, 'where rownum = 2' !!!).
I can't believe this has ever become an issue. The behaviour is as it has been since at least Oracle 7.3, after all.
And how Oracle is supposed to know which of the duplicates is the "right" and which is the "wrong" one, Lord only knows. Claims by anyone that Oracle "ought" to do anything to *resolve* duplicates is just silly... at least until Oracle Corporation perfect that dbms_mind_reading package.
Another example of Ault writing first and checking second, I'd say.
It's not intuitive but, think about it three times, it's the only way it could work in general. Intuition is not enough.
Mike gave: http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96521/general.htm#13282
technically correct, but doesn't do a very good job in explaining - that's why we need books (not just copy & paste from the manuals).
It seems to me there is a documentation inadequacy which adds to the confusion: "All rows that violate a constraint must be either updated or deleted from the table containing the constraint."
Shouldn't that be "All except one . . . " in this case?
Putting it another way: no single row voilates the constraint by itself; it's the SET of duplicates that is in violation.
>> it wouldnt have hurt to have the EXCEPTIONS INTO clause treat the first row it encounters as the "first" and put the others into the exceptions table <<
As Ajay says this is just not practical ... the result would be non-deterministic. There's nothing to say that the row that is first encountered is the row that first possessed the duplicated value at all, and even if it were then there's nothing to say that this is the row that ought to be kept.
I was going to suggest a "keep first" extension to 'exception s into', but lets face it, anyone that doesn't want to take a 5 seconds of effort to add a 'row_number' clause their exceptions table to get the 'first' one, probably wouldn't take the time to discover a 'keep first' enhancements in the docs anyway :-)
I feel like this confusion is probably generated by the misconception that whatever goes into the exceptions table can be discarded or is going to be automatically repaired.
In the past we had to manually come up with the logic and the code to deal with what the 'EXCEPTIONS INTO' clause does now. So I welcome an clause which is going to offer a quick and elegant way of doing that.
However what this clause does not do is to go through the 'discarded' records and find out why they broke the rules. But that was not within the scope of this clause in the first place. I find it difficult to understand how someone could have deduced that from the documentation.
I feel like this confusion
but the exceptions table does tell you "why", each row is tagged with the constraint is violates
and sql being sql -- sets being sets - rows not having any sort of "ordinal" meaning, all rows that violate an integrity constraint when present together -- violate the integrity constraint.
I think that is the other thing they overlooked -- their goal "de-dup" a table leaving just one row behind. EXCEPTIONS INTO -- is there to identify rows that violate a constraint. It was much like using a hammer to put a screw into a piece of wood.
You have some problem with my approach to woodworking? ;)
"It was much like using a hammer to put a screw into a piece of wood."
I was writing a comment on Mike's blog and I used those exact words for the analogy. I replaced it with immigrants who ate banana peels because they had never been shown how to eat bananas.
I also think the way it works now is the only way it should work : it is up to human beings to know which rows are right, and which are wrong, can't and won't ever be automated.
I agree with DaPi, this sentence is quite confusing though : "All rows that violate a constraint must be either updated or deleted from the table containing the constraint." and without the little example just following it, I would have doubted of the way it actually worked. Of course in order to kill any doubt I would have tried it and see what happened, but I still think that this sentence should be clearer.
At what point was the exceptions into clause ever meant to be a way to actually MODIFY data, as opposed to a method to merely IDENTIFY data that should be manually reviewed during a migration or data cleansing project? This is nothing new...
Those complaining about oracle documentation should try to read/understand DB2 documentation ( no offense intended to DB2 users )! Oracle is miles ahead in clarity/ease of use........
Excellent poll, Tom. After reading the documentation, I was confident that there would be 2 rows in the exception table. However, I second-guessed myself thinking that the answer is different from the obvious, and maybe that's why you posted your question. In other words, all the points of using a scientific method to construct an argument went out the window, and I followed a completely irrational approach of "He's up to something, even though I have no information supporting that he's up to something". Needless to say, my guess was completely wrong. However, this did teach me a valuable lesson as to why I should have stuck to my first response.
Thanks,
Arun
However, I second-guessed myself thinking
that is why I dig test cases so much...
evidence.
Exception - One that is excepted, especially a case that does not conform to a rule or generalization.
Many believe the "exception" is the data counter to the rule. Should all the data be the exception? If the rule is there will be only ONE 'foo' value in my key. All data that violates that rule is the exception. If I have 10 foos, 9 of them violate that rule.
For a lot of reasons already mentioned, that is not the way it works, but the confusion results from the definition of exception and people's idea about sequential processing of the enabling of a constraint. Each new value I see is ok, every time I see a value I already have, it is an exception to my rule. Whoever was first is not semantically part of the exception.
It's OK that Oracle defines it differently than the dictionary. :)
Exception - One that is excepted,
if you have 10 duplicates they are all exceptions to the rule when looked at collectively.
When looked at INDIVIDUALLY, none of them are exceptions. If and ONLY IF you look at them together are the ALL exceptions to the RULE.
They are all counter to the rule when they all exist simultaneously. There is no first, there is no second, there is no Nth occurence, the all exist "simultaneously"
If this were an array, I'd agree with you. This is a SET, not an array, the entire concept of "ordinal" doesn't even exist.
The set of rows that collectively violate a constraint are the exceptions to the rule. That is the set of rows that violate the constraint when looked at collectively.
If you were loading this table row by row - you would get what you describe, but only because the SET is ok with the first occurence, the SET becomes NOT ok with the second. But if you load up a table and say "turn on this constraint", those same two rows would both be in violation.
Oracle Documentation is crucial and I believe too few Oracle wannabes use it. Too bad.
I knew what the answer was because I've written PL/SQL code to reconcile a set of records provided by the exception clause. However, the documentation makes it pretty clear that's what you will get.
On another note, voting popped up a page from screensavers.com. If I'm not mistaken, this is one of the websites hit last week with a $1.85 billion lawsuit from NY State for spyware/adware misdeeds. Beware!
Oracle Documentation is crucial and I believe too few Oracle wannabes use it. Too bad.
I knew what the answer was because I've written PL/SQL code to reconcile a set of records provided by the exception clause. However, the documentation makes it pretty clear that's what you will get.
On another note, voting popped up a page from screensavers.com. If I'm not mistaken, this is one of the websites hit last week with a $1.85 billion lawsuit from NY State for spyware/adware misdeeds. Beware!
Just curious.
Why wouldn't the alter table command implcitly commited the data prior to enabling the constraint?
yes, but the question was "how many rows would the alter table put into the EXCEPTIONS table..."
now how many rows are in the source table you are putting the constraint on.
Doh!!
I read the docs better than your question. I read the docs and saw Rollback and put transposed the tables in my brain.
Seems to me that that the table being alter should not be messed with in a manner that would screw-up (ie cause a mismatch) the rowids in the exception table.
Am I correct or is my brain still "transposed".
I've had heated arguments with developers over this one, why it makes sense to use the exceptions table, and secondly why it works the way it does.
We were working on a data warehouse loader, and they were wondering why loading 2 million rows was taking all night. Turns out, to eliminate dupes in the feed (it was a large & messy reference table), they kept all the unique & PK constraints on the table, and letting them reject them one by one!
I suggested to "disable novalidate" constraints / direct path load / "enable validate" constraints with "exceptions into".
"But exceptions is broke! It gets rid of all of almost all of our rows!"
That's by design.
"Well that's awful!"
Think sets, not records!
"Then we have to write all these difficult to understand deduping logic and need to create temporary tables, etc. It's so difficult!"
I don't agree. It's a tradeoff. On a table with a few thousand rows, I'd probably preserve the constraints let it do a conventional load. Up the ante to a few million, god help you. Just get on with it - write the extra two or three statements to do the deduping & re-insertion.
I don't think any of them even knew the feature existed, to be honest, I had to give them links to Tom's site to convince them. sigh.
I picked two, but, I also knew that because you were asking the question, that there was a potential mistake that could be made.
I think the previous poster--"Think sets, not rows" had it right. It's is fairly easy, because of other programming concepts, to have the mental model that what you're doing is iterating through each row and only addding it to a list of exceptions if upon trying to add it to the index you get an error.
For what it's worth, it wouldn't hurt in the documentation to bold the word "all."
I wouldn't rely on the exceptions table to assist in removing dupes anyway.
I guessed right but had that slight doubt in the back of my mind. So did what Tom would have done. Tried the example and :)
I try catching up with asktom.oracle.com but sometimes tend to forget what I read.
I found that when I try it out myself with an example, I tend to remember it.
Is quality (completeness, clarity) of documentation related to how often the feature is used?
I never encountered customers or collegues using this feature. And everyone understands what it is, but rarely implements it. Oracle wouldn't get many questions about it which could indicate flaws in the documentation.
I knew the feature, and I can imagine using it, but I just don't. A bit of "I want to use it - it's useful, but I don't want to implement it - it's awkward"
Bart
Is quality (completeness, clarity) of documentation related to how often the feature is used?
Sure it is, but I don't think it necessarily applies in this case.
ALTER TABLE T ADD CONSTRAINT...
Ok, how many of you issue that command once a day? Once a week? Once a month? Never?
Now, how many of you do that against a table full of data? (even less).
Now, how many of you have a reasonable expectation that against this full table, you expect the constraint to go on "ok" and if it doesn't -- you'd like the rows to be identified so you can then scrub the data and then enable the constraint?
Now we are in a rather small set of people. This is not a general purpose function that is used every second by someone. It is something that -- by the fact it is in a 'alter add constraint' -- would be used very infrequently.
But it is documented, it is easy to test, it is easy to understand what it does.
I think if you polled a room of experienced DBA's, and asked them have you used:
Advanced Replication -- probably get the same ratio.
Audit command -- I would hazard a guess lots haven't used that either.
Alter system suspend, probably a 0% hit in most cases.
Alter system quiesce would likewise generate a small hit I would suspect.
but that doesn't mean the documentation is wrong/bad -- it means the command is of use to a small amount of people.
Many people have only used the add constraint against an empty table.
Or they didn't bother to even see that there was an exceptions into (majority of the case probably)..
Specify a table into which Oracle places the rowids of all rows violating the constraint. If you omit schema, then Oracle assumes the exceptions table is in your own schema. If you omit this clause altogether, then Oracle assumes that the table is named EXCEPTIONS. The exceptions table must be on your local database.
I'll probably take some flack for this, but I personally feel that the documentation
1) Is technically correct in that it unambiguously states the expected behavior, but
2) Was poorly worded.
The relational concept is not an intuitive one - people don't naturally think of things in terms of unordered sets, but rather as chronological events.
So, people who haven't been working with RDBMS's for long (or people who have, but whose Monday morning cup of coffee hasn't kicked in yet), will intuitively look at your example "relation" {1, 1} thinking: "a row that was entered which didn't violated any constraints, and another row that was entered which *did*".
Personally, when I document anything that works remotely counter to human intuition, I take the time to supply one or two examples to back it up. Oracle Technical Writers should hold themselves to no lower a standard - mis-interpretations of their documentation can be just as expensive to a corporation as, say, a bug in the CBO.
Case in point, try repeating your poll with this:
Specify a table into which Oracle places the rowids of all rows violating the constraint.
For instance, if two rows share a value for a column, and the table is altered to place a unique constraint on that column, the ALTER TABLE command will fail and the ROWID's of both offending rows will be placed in the exceptions table.
If you omit schema, then Oracle assumes the exceptions table is in your own schema. If you omit this clause altogether, then Oracle assumes that the table is named EXCEPTIONS. The exceptions table must be on your local database.
I feel like this confusion
I just wanted to qualify what I meant when I said, 'what this clause does not do is to go through the 'discarded' records and find out why they broke the rules'.
I was not referring to the broken constraints, which are in fact specified for each row in the EXCEPTIONS table.
My thoughts were going to the origin of those records. For instance, why was a record which breaks a contraint generated in the first place? And more importantly how should these records be "repaired" in order to maintain relational integrity?
I find that these investigations are actually the ones which take a long time.
Ivan Pellegrin
Note Connor McDonald's 2001 prescription here http://www.jlcomp.demon.co.uk/faq/add_primary_key.html in which he says, in part:
* Alter the primary key with the VALIDATE EXCEPTIONS INTO clause... the EXCEPTIONS table contains the rowids for the problem rows
* Remove the problem rows
The problem rose, eh?
@Duke,
not sure what you meant to convey with your last sentence?
@Tom-- The EXCEPTIONS table contains the rowids for the problem rows (say, 2 rows with duplicate key values).
* Removing the problem rows means deleting them both, eh? :)
Perhaps Oracle added the IGNORE_ROW_ON_DUPKEY_INDEX hint in 11gR2 to bypass EXCEPTIONS INTO ?
@Duke
it was added for edition based redefinition, to facilitate a "warm" or "hot" application upgrade - one in which data was being "upgraded" as well as the application code.
Say you have a table you want to massively modify for version 2.0 of your application. You are going to build a new version of it - however, you want to do a warm or hot rollover (minimize downtime) - so for a while, version 1.0 will be maintaining it's version of the table and via a crossedition trigger, the new table as well. But you need to move over all of the existing data - while the old version is modifying the table as well. You'd like to do that without having to take an outage - and you want to avoid errors (dup val on index) that would happen if the version 1.0 application inserted a row into the version 2.0 table before your mass move did.
So, you
a)put the cross edition trigger in place, version 1.0 will start populating version 2.0's table every now and then.
b) do the mass insert - ignoring any rows that version 1.0 has already moved over - you won't error out and all of the data that version 1.0 created in the version 1.0 table gets moved over - some via the cross edition trigger and some via the mass insert
It is a very special use hint, one that I can see being abused and used incorrectly in the not so very near future....
Hi Tom, Ankit here..
While loading data into a table from an intermediate table, I got following error :
ORA-00001: unique constraint (WMDWHADM.UQ_LOCHIERARCHY_ODS) violated.
Problem is that, i am not able to find out the exact contents of this Constraint. I have tried several things :
select * from dba_constraints where constraint_name like '%UQ_LOCHIERARCHY_ODS%'
--No rows--
select * from dba_cons_columns where constraint_name like '%UQ_LOCHIERARCHY_ODS%'
--No rows--
select * from dba_objects where owner='WMDWHADM' and object_type='INDEX' and object_name='UQ_LOCHIERARCHY_ODS'
-- This gave one row showing the constraint.
But still, I am not able to get the columns involved so that I can pinpoint correctly the wrong data combination for solving the issue.
Please help in this regard.
@Ankit -
you don't actually have a constraint, you have an index that was created with the UNIQUE keyword.
Query dba_ind_columns to find the columns in the index.
POST A COMMENT
<< Home