Monday, May 12, 2008

Another 'question of the day'...

This time, the question is not because it was poorly phrased.  It is because it is regarding an area I want to bring to the attention of a larger audience.

The question went like this:

 

<quote>

Suppose, I have a table having country currency combinations with a primary currency. The below is a sample data.

Country        Currency    Primary_Currency
US USD Y
US USN N
US USS N




And, I have a before update trigger on the above table for each row

(using autonomous transaction to avoid mutating error)
to check whether


the country has any primary currency. I am giving the below update.



</quote>




Ok, serious red flags here - They have a business rule apparently that AT MOST one currency can be "primary" for a given country - that is - COUNTRY must be UNIQUE when PRIMARY_CURRENCY='Y' (what I call selective uniqueness).  Now, I might suggest we are missing an entity in this model - that is, the primary currency 'flag' should not be there, but rather there should be two entities - primary_currency, and other_currencies - perhaps.  Then, the problem is simple - COUNTRY would be the primary key of the primary_currency table - and COUNTRY,CURRENCY would be the primary key of the other - an in fact, if the rule was "A country may have at most one primary currency and must have at least one primary currency" - then enforcing that more complex one would be trivial - just add a foreign key from other_currencies to primary_currency - and you are done.  Could not be easier. 



But - back to the red flags:




  • I have a before update trigger:  I have written many times that you cannot enforce integrity constraints that cross objects (tables) in the database (like referential integrity does), nor constraints that cross rows in a single table (like uniqueness does) without using the LOCK TABLE command.  Since reads are not blocked by writes - and you cannot see others uncommitted work - you need to serialize access.


  • Using autonomous_transaction:  These are almost universally "misapplied".  They have one use - in a "LOG_ERRORS" routine, called from a WHEN OTHERS exception handler (which would of course be followed by RAISE;)


  • Using autonomous_transaction to avoid mutating error:  Ok, now I know we are in serious trouble here.  If you have to use an autonomous transaction to avoid a mutating table constraint (not error, a constraint, a subtle warning to the developer "what you are trying to do is something you should not be doing") - you have a bug in your code, I'm 99.999999% certain of that - I would be very hard pressed to come up with a valid reason in real life to use an autonomous transaction to avoid a mutating table constraint.



So, we have these HUGE red flags - all of which get proven out in the next bit - everything I guessed:




  • This does not work in single user scenarios


  • This does not work in multi user scenarios


  • In short, this does not work - it is a huge bug



Are shown true:




Update Country_Currency_Ref
set is_primary_currency = 'Y'
where (country_code = 'US' and currency_code = 'USN');




And the trigger is working fine and giving the correct message that it cannot be updated as there

is already one primary currency against that country Now I update the data as below

(No primary currency)



Country        Currency    Primary_Currency
US USD N
US USN N
US USS N




and try to update the data with the below condition



Update Country_Currency_Ref
set is_primary_currency = 'Y'
where (country_code = 'US' and currency_code <> 'USD');




In this case, the trigger is failing and updating other two records. Why is this behaviour?


And how can we over come the same?




Sorry, the trigger is not failing, the trigger is just doing precisely what it was coded to do.  The problem is the person coding the trigger doesn't understand transactions and concurrency controls and how the database works.  By using the autonomous transaction to query the table - it is as if they started a brand new session to query that table.  Of course that brand new session cannot see any changes made by not yet committed!  Including the changes the trigger is trying to validate.  This is PRECISELY why we have a mutating table constraint - your row level trigger is being fired as the rows are modified - if you were able to query the table in the trigger without the autonomous transaction - you would see the table half updated (and then what, what a MESS that would be).  So, since Oracle is evil and prevents you from doing something bad here - you use an autonomous transaction - you get a consistent view of the table to be sure, but you cannot see the data you are trying to validate!!!  That makes validation pretty hard.



Even if you did the "common mutating table constraint workaround" - by deferring your reads until the AFTER (not for each row) trigger, you cannot do this check in a trigger without LOCKING THE TABLE.  You would have to serialize access in order to prevent two concurrent sessions from each creating a primary currency record at the same time. 



I would prefer a two table solution here:





ops$tkyte%ORA11GR1> create table primary_currency
2 ( country varchar2(2) primary key,
3 currency varchar2(3)
4 )
5 organization index
6 /

Table created.

ops$tkyte%ORA11GR1> create table other_currencies
2 ( country varchar2(2),
3 currency varchar2(3),
4 constraint other_currencies_pk
5 primary key(country,currency),
6 constraint must_have_at_least_one_primary
7 foreign key(country)
8 references primary_currency(country)
9 )
10 organization index
11 /

Table created.


That solves the problem quite elegantly - and even enforces the complex "must have a primary currency" constraint if you implement the foreign key.  However, if they were to keep this single table, then I would say:




drop your trigger.



and promise to never use autonomous transactions to avoid mutating table constraints (I said "mutating table CONSTRAINTS", not error - the error is you using an autonomous transaction to destroy your data integrity) ever again.



add:



create unique index only_one_can_be_primary on country_currency_ref

( case when is_primary_currency = 'Y' then country_code end );




 



I hate triggers

I hate autonomous transactions


I hate when others



If we removed those three things from PLSQL - we would solve 90% of all application bugs I think... No kidding.  I know, in the right hands, they are powerful tools.  However, they fall into the wrong hands far too often.

Wednesday, April 30, 2008

The Question of the day...

Some days... Some days the questions just make me scratch my head....

ROW TO COLUMN CONVERSION   April 30, 2008 - 5am US/Eastern

Reviewer: ROOPA from india


HOW TO CONVERT YEARLY DATA INTO MONTHLY DATA?


Followup   April 30, 2008 - 10am US/Eastern:

BY MAKING IT UP I GUESS?



Could it be more ambiguous?  I have yearly data (one presumes that is data aggregated to the level of a year).  How do I convert that into monthly data.  Short of "making it up", I have no idea... do you?



Now, they did followup later with



 



table1 format
MONTH AMOUNT_PAID
01.12.2006 00:00:00 5395
01.11.2006 00:00:00 567
01.11.2006 00:00:00 1974
01.04.2007 00:00:00 2462
01.04.2007 00:00:00 1974
01.11.2006 00:00:00 5395
01.02.2008 00:00:00 5395

table2 format
MONTH JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
01-DEC-2006 0 0 0 0 0 0 0 0 0 0 0 5395
01-FEB-2007 0 5395 0 0 0 0 0 0 0 0 0 0
01-NOV-2006 0 0 0 0 0 0 0 0 0 0 5395 0

how to convert table1 format into table2 format i.e yearly data to monthly data.


 



Now, I don't know about you - but table1 looks suspiciously like "discrete observations with an associated date - the date consisting of year, month and day".  I certainly do not see "yearly data".



I also like how they used 5,395 three times, just to make it as ambiguous as possible (wonder what happened to 567, 1,974 and so on?)  They skipped what are likely the interesting output examples - their "yearly data that is not yearly" that has more than one observation in a month.



I guess, I GUESS, their date format is DAY-MONTH-YEAR now, that changes table1 to look suspiciously like "discrete observations with an associate date - the date consisting solely of year and month".  But, we'd be GUESSSING.



And I see a 01-FEB-2007 in table2, but I see 01.02.2008 in table1.  I have to presume that is a "typo"



sigh, and there wasn't even a create table, insert into table supplied - they want me to do that.



And the output looks utterly useless.  If column 1 is "01-dec-2006", why bother having a DEC column in the output?  We already KNOW what month this is for - every row will have 11 zeros, every single one.  Seems a bit "silly".



Asking good, well formed questions is not an art, not magic.  It is however a skill.  And I find many times that when I frame my question for someone else - I find my answer.



Goes back to yesterdays post.  Writing software requires some things - a plan being one of them.  Until you can phrase your requirements in a detailed fashion - I'm not sure you know what they are or why you are doing something.....



 



We have a runner up for second place..



entire question is:




Record level Audit Old\New value same Error  April 30, 2008 - 9am US/Eastern

Reviewer:  sasirekha  from India



I have some problem using Audit Record.



Generally if we map a record to the audit Record, it will track the details of

the table insert, update, delete.



While I update the record, it will insert two different row in audit record

like  Audit Action K and N.



But both are contain the same values..



I need the old and new value.



Can any one please give me the solution with this !




questions from me:




  • what is an "audit record", must be well defined - they are using it


  • "if we map a record to the audit record" - not really sure what that means


  • "like audit action K and N" - K and N?? huh?


  • "but both are contain the same values - I need the old and new value" - well, why didn't they access the old and new values?


  • where is the sample, the example, the thing that shows us what you are really doing....





And in a close 3rd place...




How to speed up the insert and update in a partion table of more than 60 millions Rows ?



Best regards,



Sam.




I don't get it some days, just do not get it.

Tuesday, April 29, 2008

How not to do it...

Two years ago at Oracle Open World, I delivered a "worst practices" talk - how not to do stuff.  I used the word "probably" a lot, for example "you probably don't need to use bind variables" (there were slides stuck in this slide deck throughout that said 'hey, these are all the opposite of reality - just in case you are reading this - they are not true!').

In the same genre - we have "Top Ways How NOT To Scale Your Data Warehouse".  This comes from the Structured Data blog penned by Greg Rahn - that article as well as the others make for some really good reads.

While I'm pumping that blog - another one to definitely look at is Richard Foote's blog.  He's been undertaking the destruction of many a myth regarding indexing (like 'indexes like large blocks', 'rebuild when height hits N', 'Separate tables from indexes for performance' and the like).  Very easy to read, very enjoyable to read.

If you use multiple computers like I do - you might be interested in http://www.instapaper.com/ - find something you like, but don't have time to read right now - save it for later.  Very nice.

Monday, April 28, 2008

The 'write' stuff...

Not long ago - I wrote about some frustrations with the state of software 'development'.  This morning I read an article written not too long ago "They Write the Right Stuff".  I liked it a lot.  Maybe not practical or reasonable for every piece of software (but then again, why not?) - but definitely sound techniques and processes for everyone.

The sections are

  • The product is only as good as the plan for the product.  Ah, they are talking specifications, communication, documentation...
  • The best teamwork is a healthy rivalry.  Indeed!  I've said before the best was to test your ability to recover in a DBA team would be to set up two teams - one is responsible for damaging a test database in any way they want to.  The other team is responsible for recovering from that catastrophe.  Next week - switch sides.  Not only fun but very enlightening (when I poll audiences, less than 5%, way less, have done a recovery in the last six months - could it be they might not be able to today if needed?).
  • The database is the software baseNow, they did not mean the database is the center of the application itself (I would say something like that) but rather the history, change control, reason for all change is.  The history, the legacy of the code is as important as the plan for the product.  You need to understand why things were done the way they were - in order to safely change it later.
  • Don't just fix the mistakes - fix whatever permitted the mistake in the first place.  My favorite one!  How many times will I have to hear in my life "we have to do X, but you cannot tell us to use method Y to do it - Y cannot be done".  I get that all of the time - tell me how to make it go faster, but don't look at or mention touching the application.  The mistakes are typically to be found there - in the application (typically means 99.99999999999999999% give or take a small amount).

 

As an aside, anyone that knows me, knows my mantra - written many times:

  • You should do it in a single SQL statement if at all possible.
  • If you cannot do it in a single SQL Statement, then do it in PL/SQL (as little PL/SQL as possible!).
  • If you cannot do it in PL/SQL, try a Java Stored Procedure.  The number of times this is necessary is extremely rare today with Oracle9i and above.
  • If you cannot do it in Java, do it in a C external procedure.  This is most frequently the approach when raw speed, or the use of a 3rd party API written in C is needed.
  • If you cannot do it in a C external routine, you might want to seriously think about why it is you need to do it...

 

Therefore - I just loved this Oracle-WTF.  And you know what - it pairs up with the "write stuff" article nicely.  I'll bet you that that original stored procedure was not planned (no specs), peer reviewed (no health rivalry), change managed as it was tweaked over time, and until now - never "fixed".  Can you imagine how long it took to reverse engineer that into a single SQL statement (I'd guess a minimum of an hour - and likely more).  An hour well spent, but I know personally the frustration of that person for that hour - cursing every developer that touched the code before them.

And I cannot tell you how much I hate code like this:

    BEGIN
v_temp_runs := 0;
SELECT COUNT(*)
INTO v_temp_runs
FROM dual
WHERE EXISTS
( SELECT *
FROM temp_runs );
END;

IF v_temp_runs > 0 THEN


 

Why count something and do something else if there was something to be counted?  JUST DO IT, if there is nothing there - SO WHAT?

At least the original code did not end in "when others then null;" - there is that.

Tuesday, April 15, 2008

UKOUG Call for papers...

The UKOUG call for papers has been announced.  This is an event I've personally attended every year now since 2003.  I've met and made some very good friends at this conference.  I strongly recommend it as a technical forum.  It is a large but not overwhelming large event - with lots of technical content (and a bit of fun as well).

I've written in the past about participating in the user community - this is the perfect chance.  Everyone has something useful to contribute - everyone does. 

So, get on the OTHER SIDE of the podium.  Yes, it can be scary, but it is definitely worth the fear factor long term.