Thursday, February 01, 2007

How to scale...

I received an email this morning from another Oracle employee and the gist was

"what percentage of the questions regarding scalability, reliability or performance you get are as a result of an 'elegantly' designed middle tier implementation that just took for granted a well tuned, well designed database"

Meaning, the presumption was made that the database will just scale and perform well without any serious thought. This question came about as a result of a discussion he was having with another internal employee who had (wisely) written (emphasis is mine)

I think some of the successes that I had in development projects as I started doing more with middle tier development always came from my early work and knowledge of the database and its capabilities. I found with Scada systems, billing systems, etc. that no matter how well your architect a system, if the database is not used effectively, then your systems will never scale or be reliable. Architects put so much time into creating what they perceive to be a perfectly architected middle tier (or SOA), without realizing that much of their success will come from the Database tier.

It has been my experience that too many of the “middle tier” experienced people now days have no knowledge of how to effectively use the DB, how to architect systems with a large middle tier component around a DB, and how to take advantage of DB features as part of a middle-tier architecture. Middle tier people have become DB illiterate.

They wanted to know what I thought about this concept. Most of you can already probably guess what my response was I am sure :) I wrote back simply:

all of them, 100%. I am not being sarcastic, just agreeing 100% with what was said. Chapter 1 of my books start with:

Developing Successful Oracle Applications

I spend the bulk of my time working with Oracle database software and, more to the point, with people who use this software. Over the last 18 years, I’ve worked on many projects—successful ones as well as complete failures—and if I were to encapsulate my experiences into a few broad statements, they would be

  • An application built around the database—dependent on the database—will succeed or fail based on how it uses the database. Additionally, in my experience, all applications are built around databases. I cannot think of a single useful application that does not store data persistently somewhere.
  • Applications come, applications go. The data, however, lives forever. In the long term, the goal is not about building applications; it really is about using the data underneath these applications.
  • A development team needs at its heart a core of database-savvy developers who are responsible for ensuring the database logic is sound and the system is built to perform from day one. Tuning after the fact (tuning after deployment) typically means you did not give serious thought to these concerns during development.

To me, the ability to scale (perform, be reliable) starts with and many time ends with the database implementation and how the application interacts with the database.

Period.

POST A COMMENT

41 Comments:

Blogger SeanMacGC said....

Amen!
(And welcome back)

Thu Feb 01, 09:04:00 AM EST  

Blogger Patrick Wolf said....

I couldn't agree more. It seems that most of the J2EE developers just take care of there middle-tier architecture and don't care about the database.

It's just a black box for them. Hibernate will do the magic for them... ;-)

I have to confess, I'm one of this PL/SQL dinosaur :-)

Patrick
Check out my APEX-blog: http://inside-apex.blogspot.com

Thu Feb 01, 09:38:00 AM EST  

Anonymous Anonymous said....

A chain is only as strong as it's weakest link. The average application in this day and age has many

Thu Feb 01, 10:12:00 AM EST  

Anonymous Toon said....

Amen one more over here.

If the application is a 'window-on-data' application (and most of them are...) then spend time on creating a good database design.

The Java/J2EE community tends to totally forget about this.
But it's not really their fault: the origin of the ignorance is seeded in the educational system.

Thu Feb 01, 10:24:00 AM EST  

Blogger yas said....

I have recently talked to someone from a well-known entertainment company. He told me they were getting rid of all Oracle databases and they had decided to become database independent. They will develop web applications without caring about the database and they will be able to switch to another database vendor whenever they want to.

Thu Feb 01, 10:47:00 AM EST  

Blogger Thomas Kyte said....

yas said...

do us a favor, when they have succeeded AND have switched between any two databases.....

Please come back and let us know :)

Thu Feb 01, 10:53:00 AM EST  

Anonymous Mark Brady said....

I feel like Edward Everett at Gettysburg. It's taken me 2 years to say what you said in 2 minutes.

I often add one additional phrase.

Every developer feels that they are fully qualified to design the database portion of an application, since it's little more than a parking lot to store bits when the application isn't using them.

Thu Feb 01, 11:01:00 AM EST  

Anonymous Bert said....

I would have worded that last bullet item as two:

- good database design determines fundamental performance "potential"

- competency of SQL programmers decides upon successful use of design

Because 20% or more of the databases I've seen fail the first test - I've even been an expert witness in court cases on this.

And because 60% or more of the SQL programmers I've met are total hacks. They don't know basic SQL or set language theory/use, let alone how to tune it.

So that's why so many application projects fail. It does not matter which database, which platform, which technology or how much money they spend.

Someone should start a betting line against application projects - sort of a hedge fund against database application success - because the odds would pay off :)

Thu Feb 01, 11:26:00 AM EST  

Anonymous Jeff said....

Anyone interested in learning how to measure the scalability of a database or even see how best to setup Oracle RAC for optimal performance should check out Quest's Benchmark Factory product here.

Thu Feb 01, 12:19:00 PM EST  

Anonymous MaartenV said....

Look at it from the developer viewpoint.
It is the fault of SQL. Since we left the codasyl-DB, we do not need access path analysis anymore. Because all the data is always retrievable and the database is always performant.
Ans if stuck, it must be a db-bug. ;-)

Thu Feb 01, 12:44:00 PM EST  

Blogger Thomas Kyte said....

... Look at it from the developer viewpoint ...

that is in no way shape or form a "developer viewpoint".

Proof:

a) i am a developer.
b) i don't see it that way.

The data is retrievable, but that says nothing about the performance.

Thu Feb 01, 01:05:00 PM EST  

Anonymous Anonymous said....

Preaching to the choir here, but you're dead on, Tom. Where I work there's a vanishingly small portion of the technology people who understand that applications exist to support data, not the other way around.

Thu Feb 01, 01:33:00 PM EST  

Blogger Patrick Wolf said....

Recently I read this interesting blog posting about .NET developers complaining about NVL and OUTER JOINS... It's always the fault of the database! :-)

Patrick

Thu Feb 01, 02:22:00 PM EST  

Anonymous Anonymous said....

So how about that posting on eBay architecture that mysteriously disappeared from asktom. You know, the one that discussed eBay architecture - all sorting, joins are done in the middle tier. The oragle database is used just as a data dump. no packages, stored procs, triggers, etc.

eBay scales pretty well, dont you think?

Thu Feb 01, 03:05:00 PM EST  

Blogger Thomas Kyte said....

you can scale in many ways.

do i believe the ebay approach is wise? nope. It could be accomplished in a much easier fashion.

Thu Feb 01, 03:09:00 PM EST  

Anonymous Anonymous said....

Hello, Tom! May I ask you a personal question here? Many great scientists believed in God. You are an Oracle Guru and scientist of some kind. What is your attention to God an faith?

Thu Feb 01, 04:11:00 PM EST  

Blogger Alberto Dell'Era said....

I think some of the successes that I had in development projects as I started doing more with middle tier development always came from my early work and knowledge of the database and its capabilities

Could be also that the mindset is transferrable to the middle tier as well, not only the knowledge about the database.

For any relational database, especially Oracle, is deeply grounded in logic and Mathematics (in fact sometimes I can't tell whether I'm using sqlplus or matlab), so working with it (appropriately) necessarily improves one's skills in logical reasoning, and analytical abilities.

Not to mention abstraction skills - what's a schema after all, if not the essence of a solution without the frills (or if you prefer, the mathematical model of an application data) ?

Thu Feb 01, 04:27:00 PM EST  

Anonymous Anonymous said....

I agree with Alberto. Programming to me is like a game but it isn't, but it is.

It's like proofs in math. Yeah, I could prove that this is equal to this or this is true, but wouldn't it be more efficient to do it in 3 steps instead of 12? It's not just what you begin with and what you end with but how you get there. That is why I always try and look for newer, better ways. Being a DBA, I still try and understan the programming side. Pipeline Functions, Collections, Arrays, Parallel statements, channels, different ways of analyzing statistics and on and on.

I like what you say Tom, "I learn something new about Oracle every day."

I try to as well.

P.S. It's great to have you back!

Thu Feb 01, 06:34:00 PM EST  

Anonymous Dinesh said....

I was one of those proponents many years ago and realized that with the niddle tier centric architecture achieving scalability and performance took a lot of effort. We need a cache to start with then had to move to concurrent caches. Then there were the indexing services for those caches. I soon realized where I was heading to (asembling a database in the middle tier) and chucked all that and moved lot of the processing to pl/sql as we were already invested in Oracle. That reduced the costs (in all aspects) significantly and gave us better performance and scalability.

Fri Feb 02, 12:00:00 AM EST  

Blogger Sidhu said....

Hi

Something really different from what is goin on here.

I live in New Delhi, India. Yesterday I went to get both the Tom Kyte books for one of my friend. I went to one shop and asked about "Effective Oracle by Design", that guy said No. I said there is another book by Tom Kyte on Oracle 9i & 10g. He said yes that is available by "O Reilly". I said ok show me then what he gave me was Tomcat: The Definitive Guide :)

Is there something gonna be like "Effective Tomcat by Tom Kyte" ;)

Sidhu

Fri Feb 02, 12:45:00 AM EST  

Blogger yas said....

do us a favor, when they have succeeded AND have switched between any two databases.....

Please come back and let us know :)


I think we all will not live that long :-)

Fri Feb 02, 07:35:00 AM EST  

Anonymous Gabe said....

Bert said …

And because 60% or more of the SQL programmers I've met are total hacks. They don't know basic SQL or set language theory/use, let alone how to tune it.


“SQL” requires three keystrokes. It doesn’t take much to have it dropped in a resume. Who doesn’t, really? Show me a Java/J2EE, VB/.NET developer/architect with no “SQL” on their list of acronyms!

Anyway, regarding the hacks, it is a demand and supply model, just inevitable really. I would venture to say the percentage of phony developer/DBA/designer/architect skills being hired is in the same ballpark as the percentage of phony IT management skills doing the hiring.

There are people in charge of delivering and managing IT projects/systems; they position themselves in these well-rewarded (money and glory) roles and they won’t move, willingly, anywhere but up. You get hacks in charge and the system will come out right only by accident, likely at a much higher cost (money and aggravation for the ones involved). It is truly about managing appearances; I see IT projects/shops being managed like fashion shows rather than railway systems.

Paperware? I have seen few of those; they might look wonderful on paper (tons of paper actually, the more the merrier) but there isn’t much behind it. And paper is just one type of communication. Nowadays IT management is obsessed with communication. So much so that there is an oversupply of excellent communicators, communicating [through as many sophisticated channels as technology gadgetry allows … iPhone anyone?], much of the same thing, mostly of questionable actual, actionable value. The act of communicating has become so important that, many times, overshadows the reason for communicating in the first place.

Developer vs. DBA, middle-tier vs. database, architect vs. developer? Show me who’s in charge, who controls the budget … if (s)he’s a hack then forget about all that. Kind of like a terminal illness, from that point on it is just about managing the threshold of pain.

Noailles’ ”I prefer honor to honors” only rarely applies. Most people will quickly advance to a position for which they aren’t capable or qualified. And hacking once just seems to wet the appetite … once that psychological barrier is left behind, hacking as far as one can get becomes the new goal. Who can resist temptation? Which brings me to that rather oblique inquiry regarding someone else’s faith [if that’s not a private matter then nothing is] … I wonder if praying does make systems more scalable? If it works, I might suggest it to few people.

Fri Feb 02, 01:30:00 PM EST  

Anonymous Deva said....

Tom, I may have to slightly disagree about using Databases - When I worked in UPS, I saw a variety of application in UPS.com, which never use a database, but instead using messaging(to/back from Mainframes) and flat files :-)

Fri Feb 02, 01:38:00 PM EST  

Blogger Thomas Kyte said....

Deva -

and what do you think your message store used to persist data?

a database. MQ Series for example, uses DB2 as it's persistent store.

AQ - uses Oracle as it's persistent store.

Or, did UPS just "lose" messages when systems failed (you need not answer, I know they didn't - because the message store they used was a database with recovery abilities - so when a message was queued - it was committed!)

Fri Feb 02, 01:44:00 PM EST  

Blogger EscVector said....

The real problem is that those who code are quite often not those who work 24/7 and have to meet SLAs. Most people I've met are not like Tom. They do what they have to do and little more. This attitude makes it easy to pass the buck. I work in a software company where the developers will say "we don't' have to worry about the database. That's Operations problem." I really believe it is getting worse.

Fri Feb 02, 04:20:00 PM EST  

Anonymous Anonymous said....

unfortunately I have to disagree with a few things...first, the application should not care about it's database! Meaning, I don't care if the database is in Oracle or MySql or SQL server. A database is a database and that's that. Sure there are some tunings that will need to occur but for the most part an application should be database agnostic. Therefore, the application is not "built" around the database. All you DBA's think you rule the world and frankly I've seen applications fail because the DBA could not get it through their mind that we don't want a database that's to the 3rd or 4th normal form!

I'm a software developer that has had way too many bad experiences with the DBA's thinking they are god and refusing to do anything unless it's there way. Guess what? I can write SQL and DDL too!

Fri Feb 02, 07:14:00 PM EST  

Anonymous Anonymous said....

I am a neither a programmer nor a DBA, but do work for a software company. If I typically look at the issue, here is how I view it

Programmer should no way be concerned with optimizing SQL statements. It is neither their responsibility nor their skill. Assuming 10s to 100s developers are working on different projects getting ready to use the same database, he/she does not need to know the complete architecture/layout of the database nor how to optimize their queries for the database. It is the role of DBA/Data Architect who understands (or should understand) the architecture/layout of the database. Programmer should be concerned with something like optimizing the memory usage, disk space usage, logging, performance for the application developed etc (of course, many of the programmers are leaving to the underneath layers for most of these functions). If developer needs data from database, he/she can either
give the requirement to DBA and DBA is responsible for handing over data in required format
or
get the SQL statements used in the code to be analyzed by DBA/Database Architect before even it is considered for deployment in to QA environments.

First option may be difficult to implement, but second should work with little co-ordination between developers and DBAs/Data Architects. If developers want to understand how to optimize SQL statements, he should move to DBA/Database Architect group.

Fri Feb 02, 08:29:00 PM EST  

Blogger Thomas Kyte said....

sorry - i'll be blunt to the last two anonymous posters.

bzzzt

wrong, foolishly wrong, absolutely wrong, stupidly wrong.

think about it guys/girls.

If you use data
If you save data
If your application counts on data

guess what just might matter....

yeah - data.

I'm a developer. I'm not a DBA. You are so far from the mark as to be - well - wrong.

Fri Feb 02, 10:56:00 PM EST  

Blogger Patrick Wolf said....

>Programmer should be concerned with
>something like optimizing the memory
>usage, [...] for the application
>developed etc [...] If developer
>needs data from database, he/she can
>either give the requirement to DBA
>and DBA is responsible for handing
>over data in required format

Is it really the role of a DBA to write SQL statements for an application developer? Maybe it's in the US, but not over here in Europe.

That's the job profile I see for a DBA in a development process:
-) Work with the software architect/data modeler to define the parameters for the physical storage (tablespaces, storage clauses, ...)
-) Optimize the data model for easy operation
-) Optimize individual SQL statements which have a performance problem

Profile of a programmer working for a data centric application project:
-) Has to know the data model of the application (or at least the parts he is working on)
-) Has to write the SQL statements to access this data model (that's HIS job!!!)
-) Has to write the business code on the application server/database server.

My 2 cents
Patrick
Check out my APEX blog

Mon Feb 05, 07:15:00 AM EST  

Anonymous Anonymous said....

Still remaining anonymous...

In the statement:
>Programmer should be concerned with something like optimizing the memory usage, [...] for the application developed etc [...] If developer needs data from database, he/she can either give the requirement to DBA and DBA is responsible for handing over data in required format

I did not exactly have DBA writing the SQL statements (though the statement clearly implies that), but may be sometime down the path, database might be a service where you specify the data and format, & you will get it. Internals of how the query is optimized is not necessarily known to developer. That is where the programmer may become independent of SQL statements. If DBA starts writing SQL statements for developers, I can imagine how it will go :). It might be little far to think at this time. My apologies for the statement.

Second option might work decent (in my observation) considering the current state of SQL statements from few (may be most) developers & DBAs sleepless nights with performance alerts...

1. Programmer writes the SQL statements in the code
2. sends to DBA/Data Architect for analysis (cost-based or what ever in DBA terms...I do not know)
3. Once approved, the code is ready to roll (of course making sure the othe other functions of code are designed ideally)

I view SQL optimization as DBAs expertise, not developers.

My one cent in this:)

Mon Feb 05, 01:22:00 PM EST  

Anonymous Moritz Fromwald said....

Great article!

I think the encapsualtion of DB Access should be pointed out, inside the DB (PL/SQL) as well as outside (Java, .NET,...).
In the end it is data we wanna display, and users don't care where it came from.

It is quite seducive to spread your SQL all over the code, but doing so, you loose your ability to optimize, tune or debug your DB access (together with your DBA)

see also
http://www.oracleplsqlprogramming.com/bestPractices.html


cheers Moritz

Mon Feb 05, 03:17:00 PM EST  

Anonymous Graham Bailey said....

Tom,
There's a typo in your last comment: "through" instead of "throw". If you could correct it I'd appreciate it. I may well be quoting you in a forthcoming meeting in my workplace regarding the use of effective SQL!

Mon Feb 05, 03:28:00 PM EST  

Blogger Thomas Kyte said....

I view SQL optimization as DBAs expertise, not developers.

than I shall be non-anonymous and very blunt.

You are part of the problem. Applications live and (frequently) die around the database. Your approach of "code it and throw it over the fence" horrifies me.

Mon Feb 05, 03:31:00 PM EST  

Blogger Bin said....

I agree with you on everything should be done in DB. However, can you point out a real life sample, which can compare with eBay's architecture?

Tue Feb 06, 01:11:00 AM EST  

Blogger Thomas Kyte said....

bin said

you mean like virtually every Oracle Applications install?

Tue Feb 06, 06:27:00 AM EST  

Blogger Vidya Balasubramanian said....

this was an issue I recently encountered - and I could'nt help but think why can't we let the database do what it is good at - keep it simple. Sequences were not used in the Application and developers coded their own sql generation object - the result everytime a db patch went in the seq gen object had to be reloaded.... it was not until a senior architect came in and had to force the developers to stop using the seq gen object

this was just a simple example - we know a whole lot of things can be done in the middle tier but keep it simple - let the db do what its good at

Tue Feb 06, 11:40:00 AM EST  

Blogger Bin said....

I'm after the architecture of a oracle application which has the similar size as ebay.

Tue Feb 06, 06:29:00 PM EST  

Blogger Thomas Kyte said....

and please define the size of ebay. Sort of nebulous. How about many telco applications - the ones that log every inbound and outbound call, or airline reservation systems, or the credit card systems that hit an Oracle database every time a card is swiped...

Tue Feb 06, 07:37:00 PM EST  

Anonymous Anonymous said....

I could not agree any more on the original topic


I started with my career in Clipper , Foxpro and later Power Builder .
As PowerBuilder was the darling of the developer's community with ease of use ,
I devoted much of my time and effort into PB .

Now , PB is no where in the market place ( except in few finanical - only in mainteannce mode ) .

To benefit from a career ( my ) prospective , one should focus on DB a lot , rather than going for a facelift ( GUI )


1990s - PB / VB / forms
2000s - Java /.NET
2010s - ?????

1990s - 20xxs ---DB will be there ...

Wed Feb 07, 11:04:00 AM EST  

Anonymous Anonymous said....

Hi,

what i think has happened in the last couple of years is that Hibernate has matured significantly and led to the rise of the Rich Domain Model as popularised by Domain Driven Design by Eric Evans.

Whether modeling a business domain is better using OO or relationaly is
irrelevent because you still have the impedance mismatch which is not addressed.

In the Hibernate authors defence, they know and acknowledge the importance of the database and relational theory. Unfortunately, developers in practice use Hibernate to reduce the database to an artificat (java-generated schemas!!!) and nothing more than a bit bucket. Proper, relational data modelling skills are become less visible (i'm lucky that my manager understands how important a relationally designed model is) and is on the decline.
I've just (lost) an argument with a developer when i questioned why we were bringing back lots of data for a simple drop down. The answer is that the drop down is a list of companies which has other references.

I predict alot of work for performance-orientated DBAs.

Wed Feb 07, 11:27:00 AM EST  

Blogger Johan said....

Tom

Thought you may be interested in this as a good example of how a system can go VERY wrong:

http://www.sundaytimes.co.za/News/Article.aspx?id=442059

This system is apparently based on an Oracle database (see: http://www.itweb.co.za/sections/business/2007/0704261031.asp?S=IT%20in%20Government&A=ITG&O=FRGN
)

I am almost certain that this comes down to design issues, more than anything else, given that they already have multiple application and database servers and the performance is still bad.

The problems of this system is actually preventing new and used car sales across the entire South Africa (since people can not register the vehicles) and is generally causing huge losses of productivity as people have to wait in queues for hours while it takes up to 20 minutes to complete a since transaction.

Since this is a Oracle based solution, you could maybe help them out?

Thu Apr 26, 09:37:00 AM EDT  

POST A COMMENT

<< Home