Tuesday, November 28, 2006

See, you just cannot make this stuff up...

Real email, received today from another Oracle person - asking me a question:

A developer wants to represent all data types as Varchar2(2000) so that they won't have to change sizes in the future. This includes Numbers and Dates. The DBA wants to know if there are any management, performance, or indexing implications.

Ouch (wonder why they stopped at 2000 with the varchar2?) That hurts doesn't it...

Here was my response:

 

They are suggesting one of the worst ideas known to human data processing people.  I cannot over state how BAD AN IDEA THIS IS ENTIRELY.

Funny thing - dates are 7 bytes, you cannot "under" or "over" size them (you never CHANGE their size), timestamps are fixed sizes as well - 7, 11, or 13 bytes depending on fractional seconds and timezone needs.

Funny thing part two: if someone defines a Number(2), we can "alter table t modify X number(3)" to increase the size.  But, if someone was to MAX OUT all numbers - we CANNOT shrink them!!!

Funny thing part three: same with varchar2 - we can make them bigger anytime we need - immediately, absolutely.  We cannot however fix the oversized varchar2 in the future when the column has data - we can GROW, we cannot SHRINK

Funny thing part four: I spend 20 minutes on this topic - using the right datatype - in all of my seminars, over and over and over again.
Here is a cut and paste from my book Effective Oracle by Design.

Use the Correct Datatype

Using the correct datatype seems like common sense, but virtually every system I look at does one of the following:

•    Uses a string to store dates or times
•    Uses a string to store numbers
•    Uses VARCHAR2(4000) to store all strings.
•    Uses CHAR(2000) to store all strings, wasting tons of space and forcing

the use of a lot of trim function calls

•    Puts text in a BLOB (raw) type

I have a very simple rule: Put dates in dates, numbers in numbers, and strings in strings. Never use a datatype to store something other than what it was designed for, and use the most specific type possible. Furthermore, only compare dates to dates, strings to strings, and numbers to numbers. When dates and numbers are stored in strings, or stored using inappropriate lengths, your system suffers:

•    You lose the edit upon insertion to the database, verifying that your dates are actual dates and numbers are valid numbers.
•    You lose performance.
•    You potentially increase storage needs.
•    You definitely decrease data integrity.

How many of you know what ORA-01722 or ORA-01858 errors are off the top of your head? I bet many of you do, because they are so prevalent in systems where numbers are stored in strings (ORA-01722: invalid number) and dates in strings (ORA-01858: a non-numeric character was found where a numeric was expected).

How Data Integrity Decreases

Using an incorrect datatype is wrong for many reasons, but the first and foremost is data integrity. Systems that use strings for dates or numbers will have some records with dates that are not valid and numbers that are not numbers. It is just the nature of the game here. If you permit any string in your date field, at some point, you will get dirty data in there.

Without data-integrity rules in place, the integrity of your data is questionable. I’ve needed to write the functions to convert strings to dates but return NULL when the date won’t convert. I’ve also needed to try one of five date formats to see if I can get the date to convert. Can you look at 01/02/03 and tell what date that is? Is that yy/mm/dd, dd/mm/yy, or something else?

How Performance Suffers

Beyond the obvious data-integrity issues associated with incorrect datatypes, there are other subtle issues. To demonstrate, we’ll use an example of a table with two date columns. One will be stored in a string using YYYYMMDD and the other as a DATE type. We will index these values and analyze the tables completely.

ops$tkyte@ORA920> create table t
  2  as
  3  select to_char( to_date('01-jan-1995','dd-mon-yyyy')+rownum, 'yyyymmdd' ) str_date,
  4         to_date('01-jan-1995','dd-mon-yyyy')+rownum date_date
  5    from all_objects
  6  /
Table created.

ops$tkyte@ORA920> create index t_str_date_idx on t(str_date);
Index created.


ops$tkyte@ORA920> create index t_date_date_idx on t(date_date);
Index created.

ops$tkyte@ORA920> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size 254', cascade=> true );

Statement processed.


Now, let’s see what happens when we query this table using the string date column and the real date column. Pay close attention to the Cost and Card= component of the plan:


ops$tkyte@ORA920> set autotrace on explain
ops$tkyte@ORA920> select * from t
  2 where str_date between '20001231' and '20010101';
STR_DATE DATE_DATE
-------- ---------
20001231 31-DEC-00
20010101 01-JAN-01
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=406 Bytes=6090)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=5 Card=406 Bytes=6090)
   2    1     INDEX (RANGE SCAN) OF 'T_STR_DATE_IDX' (NON-UNIQUE)
              (Cost=3 Card=406)

ops$tkyte@ORA920> select * from t where date_date between to_date('20001231','yyyymmdd') and to_date('20010101','yyyymmdd');
STR_DATE DATE_DATE
-------- ---------
20001231 31-DEC-00
20010101 01-JAN-01
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=15)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=3 Card=1 Bytes=15)
   2    1     INDEX (RANGE SCAN) OF 'T_DATE_DATE_IDX' (NON-UNIQUE)
              (Cost=2 Card=1)

So, what happened there? Well, the optimizer understands VARCHAR2 types and it understands DATE types. The optimizer knows that between the two DATE items December 31, 2000, and January 1, 2001, there is only one day. The optimizer also thinks that between the two string items ‘20001231’ and ‘20010101’ there are a whole bunch of values. The cardinality is thrown off.

But, so what? What do we care if the cardinality is wrong? It won’t affect our output—the answer. That is correct, but it could have some impact on our overall performance. Consider a different query against the same data, asking for effectively the same result set:

ops$tkyte@ORA920> select * from t
  2 where str_date between '20001231' and '20060101';
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=2034 Bytes=30510)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=12 Card=2034 Bytes=30510)
ops$tkyte@ORA920> select * from t where date_date between to_date('20001231','yyyymmdd') and to_date('20060101','yyyymmdd');
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1823 Bytes=27345)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=12 Card=1823 Bytes=27345)
   2    1     INDEX (RANGE SCAN) OF 'T_DATE_DATE_IDX'
              (NON-UNIQUE) (Cost=6 Card=1823)

Note:    As discussed in Chapter 6, different database parameter settings will influence the cost of various operations. You may need to increase the range of dates to see the same effect as shown in this example, but it will happen at some threshold.

This time, the fact that we hid a date in a string has a serious side effect. Our query plan has changed. We are now full-scanning for the string date but index-range scanning for the DATE type date. So, besides the fact that there is nothing stopping someone from inserting 20009950 as a date value into our field, the use of a string has withheld valuable information from the database. We lose all around.

How You Might Increase Your Storage Requirements

In addition to using the proper base datatype such as number, date, or string, you should also use the most specific type you can. For example, use VARCHAR2(30) for a field that is up to 30 characters in length; do not use VARCHAR2(4000).

Ask Tom
“I work with a modelers group. My modeler would like to define every VARCHAR2 field with
the maximum length, which means that a table with 20 VARCHAR2 fields will all be defined with a maximum of 2000 or 4000 bytes. I tried to talk to him about the reasons we identify data with correct lengths and names in order to understand what we have in our database. He told me that it doesn’t matter, since Oracle just stores the length, etc., and there is no overhead. I don’t believe this is true, but have been jumping between so many databases that I cannot find a document on the internals of Oracle storage. Can you help me out here with this question?”
My gut response was, “This is your data modeler, my goodness!” They are the ones who are supposed to be telling you that it is vital to use the appropriate length when defining fields! That is their job. Let’s just forget about things like storage for a minute, why don’t we ask him:
•    What is going to happen when users pull this up in a query tool that formats each field based on the width of the column in the database? They’ll see one column and need to scroll way over to see the second, the third, and so on.
•    Say the code prepares a query that selects ten columns that are VARCHAR2. The developers, for performance, would like to array fetch (very important). They would like to array fetch say 100 rows (very typical). So, you have 4,000  10  100 = almost 4MB of RAM the developers must allocate! Now, consider if that were ten VARCHAR2(80) fields (it’s probably much smaller than that). That’s about 78KB. Ask the data modeler how much RAM he is willing to kick in for this system.
•    Now, the developers start to build a data-entry screen to put data into the database. Wow, that code field can be 4,000 characters long and that first name can be 4,000 characters long. How the heck is anyone going to know what sort of data can really go in there?
Tell your data modeler group members that they need to consider the length as a constraint. Just as they use primary and foreign keys, they should use the proper and correct length on fields. You can always expand a field via a command like alter table t modify c varchar2(bigger_number). There is no reason to use the maximum length everywhere. It will hurt the applications you develop, because they will mistakenly allocate many megabytes of RAM. Just think of the array fetch example with an application server. Now, it’s not just 4MB; it’s 4MB  number of connections. You are talking some real memory here for a single query, and you’ll be doing a lot of them at the same time.
End Ask Tom

A CHAR(2000) will consume 2,000 bytes of storage whether you put in the letter a, the string ‘hello world’, or 2,000 characters. A CHAR is always blank-padded. Additionally, are you thinking about using an occasional index in your system? If so, beware of storage-related problems. Consider the following:

tkyte@ORA817.US.ORACLE.COM> create table t ( a varchar2(4000), b varchar2(4000));
Table created.
tkyte@ORA817.US.ORACLE.COM> create index t_idx on t(a);
create index t_idx on t(a)
                      *
ERROR at line 1:
ORA-01450: maximum key length (3218) exceeded

NOTE: In Oracle9i, the maximum key length is larger – but the restriction still exists.  For example, an index on T(a,b) would raise: ORA-01450: maximum key length (6398) exceeded in Oracle9i.

My system has an 8KB block size. I would need to use at least a 16KB block size to index a single column, but even then, if I tried to create a concatenated index on T(A,B), it would fail there!

The same holds true for your numbers and the new Oracle9i TIMESTAMP datatypes: When appropriate, use scales and precisions on those fields in order to better define your data integrity and to give applications that much more information about the data itself.


In short, never be tempted to use anything other than a DATE or TIMESTAMP datatype to hold a date value, and never be tempted to use a VARCHAR2 to hold a number. Use the appropriate and correct type for each to ensure maximum performance, and more important, to protect your data integrity.

POST A COMMENT

31 Comments:

Blogger Noons said....

"store everything in varchar2(2000)":

typical demented, incompetent design "rule" of the OO/j2ee brigade.

yeah, we know: they can do all the data type validation/processing in the app server. yadda,yadda.

Directed at the morons:

Anyone can re-invent every system out there from the word go, right from the ALU to the IO controller and end up with a working solution.

Is it worth it? Of course not.
Why?
Because they'd be INCREASING the total cost of development when they ignored previous IT science and development. And ignored what is already there, ready to be used, de-bugged, tuned and optimised ten orders of magnitude better than what they'd be able to achieve in a typically costed project.

In simple words that hopefuly even these ignorants can understand:

Re-inventing wheels COSTS MONEY!

Here is a clue for them: to try and centralize the datatype validation into the app server(s) is inherently non-scalable and impossible to maintain without a serious impact on availability of the application(s).

Yes, app servers MAY have to run more than one application: ICT is a little bit more than just a shopping cart!

Not to mention that database software was designed, tuned AND debugged to safely and reliably support multiple data types, orders of magnitude more efficiently than any o-j2ee-moron will ever be able to code in any usable time.

Wouldn't it be dandy if for once these idiots tried to USE software that is already there instead of re-inventing the wheel for every project they get involved in?

Narh, that would be applying intelligence. And professionalism. Two things they are totally devoid of.

Tue Nov 28, 10:33:00 PM EST  

Anonymous Robert said....

>>
A developer wants to represent all data types as Varchar2(2000)


hmmm if that email sender is reading this (which I doubt) - I'd love to know...

How the x$#*@ HELL is "a developer" allowed to even think up such a boneheaded idea much less "wants to" do it ????????

Tue Nov 28, 11:14:00 PM EST  

Anonymous Dave, England said....

Tom, Any chance you could paste the rest of your book - I'm a bit strapped for cash at the moment.

Wed Nov 29, 04:39:00 AM EST  

Anonymous Anonymous said....

"...
but virtually every system I look at does one of the following:

• Uses a string to store dates or times
"""

for example, Oracle does that:

SQL> describe v$sql
Name Null? Typ
----------------------------------------- -------- ----------------------------
SQL_TEXT VARCHAR2(1000)
...
FIRST_LOAD_TIME VARCHAR2(57)
...

Wed Nov 29, 08:34:00 AM EST  

Blogger SeanMacGC said....

> for example, Oracle does that:

> SQL> describe v$sql

V$SQL is a view, so whilst it may be transforming a DATE, it is not storing a DATE as a string.

Wed Nov 29, 09:29:00 AM EST  

Anonymous Anonymous said....

Just curious if the developer is working on XML...

Wed Nov 29, 09:45:00 AM EST  

Anonymous Mark said....

The DBA wants to know if there are any management, performance, or indexing implications.

I think this is the scariest part. Why doesn't the DBA know? :|

Wed Nov 29, 10:04:00 AM EST  

Blogger Robert said....

>>I think this is the scariest part. Why doesn't the DBA know? :|

heck maybe "a developer" and "the DBA"
is the same person ???

Wed Nov 29, 10:11:00 AM EST  

Blogger Dougie McGibbon said....

But using varchars for dates is so useful! I mean, how else can you handle it when your yearly accounts are all closed off and balanced - and then you find a whole other set of transactions that need to be closed off before the start of the next financial year - but you can't open the last year...

So you invent a new date and have a calendar which has June 31st which is where you can dump all your excess transactions.

(Seriously - my last company did have a 31st June on their calendar!)

Wed Nov 29, 02:30:00 PM EST  

Anonymous Michael Olin said....

So you invent a new date and have a calendar which has June 31st which is where you can dump all your excess transactions.

Brilliant! Having the SA waiting around to set back the system clock a few hours so that you can make sure all of the transactions are timestamped before the fiscal year ends at midnight (yes, I was stuck watching the system churn that night) seems so stupid when you realize that by storing dates as characters you can just create another day! Now all the months can have 31 days (unless they need more).

Wed Nov 29, 05:01:00 PM EST  

Blogger SeanMacGC said....

>typical demented, incompetent design "rule" of the OO/j2ee brigade

If such a mash-brained approach to data modelling (type storage mangling/cludging), is typical of those (approaches) as propagated by 'exponents' of Object-Orientation (OO) abroad in the general field, then this betrays, on their part, a grossly fundamental misunderstanding of what OO is all about, and misrepresents, in the extreme, core principles of that same Object-Orientation. They have it so wrong!

Wed Nov 29, 05:26:00 PM EST  

Blogger Alberto Dell'Era said....

core principles of that same Object-Orientation

Actually - core principles of any programming paradigm or language - typed variables, any programming language beyond assembly has them ;)

Wed Nov 29, 05:50:00 PM EST  

Anonymous Anonymous said....

I'm posting anonymous on this one for a reason.

We have an app with a huge column. They store all these application flags in one field. We have a UAT environment and a Prod. They do things in UAT and then we move it to prod. Well, the column is a LONG! Do you know what I have to do to get a Long moved over and then to do anything with it? It's a nightmare. I'm at least trying to get them to switch it to CLOB. I was not involved in this project, I sort of inherited it after the fact. It's a mess. I'm slowly but surely getting them to seriously consider rearchitecting this thing. It'sa mess.

Wed Nov 29, 10:35:00 PM EST  

Blogger Q u a d r o said....

noons

"Anyone can re-invent every system out there from the word go, right from the ALU to the IO controller and end up with a working solution."

Uh-huh, that reminds me some things i saw recently.

One of them was the system hugely suffering from CBC latches due to excessive LIO (poorly designed data model, queries and application algorithms did that all). Instead of realizing how bad they developed their system from Oracle perspective developers decided to write it's OWN CACHE SERVER - that's all, period.

Thu Nov 30, 12:32:00 AM EST  

Anonymous Bo Brunsgaard said....

Something bothers me here (apart from the idea of storing everything as varchars).

"demented, incompetent design "rule" of the OO/j2ee brigade", "morons", "ignorants", "idiots", "boneheaded idea", "mash-brained approach".

Aren't we all in this together? Developers AND DBAs? Or is it a question of "us and them"?

From our point of view, the developers have come up with a really bad idea (and I happen to agree - this specific case is NOT a good approach - ok?). But do we take the time to discuss it with them? Do we make the effort to try to understand WHY they have come up with this idea? Do we make the effort to explain WHY we consider it such a bad idea? Does it EVER cross our minds that sometimes they have valid points and valid concerns that are different from ours?

Or do we just tell them how immensely stupid they are and send them back to the damp dark dungeon where they bloody well belong?

A while ago I had the opportunity to go "undercover" with some developers to a meeting with a groups of DBAs. That was really a learning experience, and not only a positive one!

I may be terribly naive, but I don't believe that developers are stupid people who are only put here on Earth to abuse the database and be a pain in the neck to the DBAs. They are intelligent professionals, specialists in their field, with different viewpoints, trying to fullfill different requirements.

Developers may not know everything that we as DBAs do, but do we know everything that they know? Honestly?

The developers have their pieces of the puzzle, the DBAs have theirs. Standing in each our corner, hurling abuse at each other, rather than intelligently trying to help each other, isn't going to get us anywhere.

Bo

Thu Nov 30, 03:51:00 AM EST  

Blogger SeanMacGC said....

In response to Bo:

I agree, it’s all too easy to retreat into our respective, mutually hostile, polarised developer and DBA camps. I am, however, like many, both DBA and developer and developers and DBAs are, after all, two sides of the same coin: developers manipulate and present data, DBAs facilitate that aim (in every respect).

The problem arises, I feel, when one side attempts to peripheralise and diminish the responsibilities of the other, i.e., overloading the middle tier with complex business logic that properly resides next to, and inextricably linked with, the data; usually involving some new-fangled ‘technology’ or ‘paradigm’, which is generally a perversion, distortion and mispresentation of a pure concept. (Object-orientation (OO) is a case in point. OO is completely orthogonal to the Relational Model, and demands no perversion, compromise or bending to accommodate.)

Unfortunately, all too often this originates on the development side (there’s much less scope on the DBA side) and the instinctive reaction from the DBAs is to retreat. It is, however, not the correct reaction, which is to face up to and challenge the fallacies being propagated and perpetuated as fact, though vested interests usually make that a lot easier said than done.

Thu Nov 30, 06:02:00 AM EST  

Blogger gandolf989 said....

I have worked on apps where they don't use foreign keys, because they might cause overhead, and I currently work on a system that stores a dstinct name column with is a bunch of column concatenated together and not even distinct or always tracable back to where it came from. Since this column is used in the history table, it make it impossible to track changes from history.

Thu Nov 30, 10:45:00 AM EST  

Anonymous Anonymous said....

What happens when you drive too much off data.

Free not cheap enough.

Thu Nov 30, 12:54:00 PM EST  

Blogger Noons said....

"Aren't we all in this together? Developers AND DBAs? Or is it a question of "us and them""

No we are not. And yes, it is VERY MUCH a question of us and them: I for one do NOT want to have my chosen professional field's reputation spoiled by ignorants of this calibre.

Note: the notion that ALL developers are by definition this stupid is fundamentally wrong. I said "oo/j2ee" brigade. That is a very specific type of developer and by no means whatsoever the wholeness of developer-verse!

Note also that I bundled oo/j2ee together. There is a difference.

This all started when the oo/j2ee brigade decided to completely and utterly ignore prior knowledge and go ahead reinventing every wheel they could find, every single step of the way. Throw in reinventing boiled water for good effect as well.

I'm sorry, but after 6 years of putting up with this sort of wanton, irresponsible attitude to established design science for the sake of "kewlness", I'm done with political correctness and showing moderation.

Now I call it as I see it. And its most definitely not just me.

These idiots have been giving a bad name to IT and blaming EVERYONE else, not just databases, when projects fall apart due to their complete incompetence.

God forbid the reason ever being the obvious shortcomings of their ignorant, incompetent development and design!

They are DIRECTLY responsible for the largest debacles in IT in Australia, costing the companies and/or state organisations involved incredibly HUGE sums of money. Which ultimately end up being paid by us: the public.

About time we start calling them what they really are. The word is:

con-artists

Thu Nov 30, 11:04:00 PM EST  

Anonymous Anonymous said....

As a probable member of the 'j2ee brigade' I have to take exception with noons's gratuitous generalisation of java application development. Bad j2ee developers use the correct data types, good ones don't. You shouldn't generalise - you wouldn't like all DBAs to be generalised as conservative tyrants who won't use a DB feature unless it's been available since Oracle 7, would you?

Sat Dec 02, 04:10:00 PM EST  

Anonymous Anonymous said....

[["store everything in varchar2(2000)":]]

[[typical demented, incompetent ....]]

Hmm...why not stop down from the high horse and at least give this guy some credit for creativity?

Sure, we know this is a bad approach, but maybe he's new to database design (despite being someone's data modeler). Doesn't he at least deserve a chance to say what he thinks and find out the problems with the approach without being treated like a loser?

Thu Dec 07, 10:02:00 AM EST  

Blogger Uwe Küchler said....

Tom,

thank you for this nice "rant of the day" which reinforces me in discussions with developers I have every other day.

I'd like to point out another factor that - sooner or later - more than one application will be developed for this database and mistyped data fields just WILL be misused. When arguing over constraints, some of the "architects" state something like "oh, it's all written in the concept, so we may use varchar2(4000), but the developer knows from the concept that there will only be (e.g.) 100 characters in that field". Argh. So what about the new developers designing a new application for the old data model years later? Will they probably care for your app-specific documents? Modellers, please think about the consequences of your actions in the future!

Regards,
--==/ Uwe \==--

Fri Dec 08, 05:39:00 AM EST  

Anonymous Anonymous said....

With respect to
"Consider the following:

tkyte@ORA817.US.ORACLE.COM> create table t ( a varchar2(4000), b varchar2(4000));
Table created.
tkyte@ORA817.US.ORACLE.COM> create index t_idx on t(a);
create index t_idx on t(a)
*
ERROR at line 1:
ORA-01450: maximum key length (3218) exceeded

NOTE: In Oracle9i, the maximum key length is larger – but the restriction still exists. For example, an index on T(a,b) would raise: ORA-01450: maximum key length (6398) exceeded in Oracle9i. "

This is pretty clearly a bug....that an RDBMS cannot index a perfectly legal column.

The issue there isn't the use of varchar2(4000), but rather Oracle's somewhat schizoid treatement of these fields.

The basic issue excludes the various CHAR examples (as that's obviously rather demented allocation of potentially required space)... The main problem with creating strings 'to size' is the binding of low-level storage to interface characteristics. This leads to all kinds of problems related to attribute mapping (i.e. the same field is varchar(80) one place, varchar(57) another and so forth), introducing exceptions and limitations where these need never exist. The ideal would be a datatype somewhat like NUMBER for strings, i.e. a VARCHAR2(*) type without precision at definition time.

From the comments it's pretty clear the "I'm smart too!" crowd is good at singing in 'the choir', but hasn't really thought this one through.

Fri Dec 22, 01:55:00 PM EST  

Blogger Thomas Kyte said....

perhaps you need to look further...

you can index big strings, but a conventional b*tree index pretty much "doesn't make entire sense"

however a text index would

and for that you use a CLOB which is precisely what your varchar2(*) is.

varchar2(30) - that is an integrity constraint.

Mon Dec 25, 06:34:00 PM EST  

Anonymous Anonymous said....

Very good article. This article makes some interesting points. You made a good point, reading your article is a kind of enjoyment. Thank you.
Tactical Flashlights
r c helicopter
video game
Tactical Flashlight
Flashlight
Led Flashlight
rc helicopter
helicopter
airplane
Playstation 2
Playstation 3
Nintendo DS
SONY PSP

Mon Jun 23, 03:24:00 AM EDT  

Blogger Donald Soft Tech said....

To be sure that your shopping cart doesn't make your customers annoyed, ask some friends to place an order while you're watching. Tell them what to order, but don't tell them where to find it or how to place the order and complete it. Check out how long it takes them to find things and complete the steps involved. If they seem confused or have to ask questions about what to do at any stage in the order process, you should consider making changes in the way your shopping cart functions or changing the shopping cart providers. http://www.infyecommercesolution.com/

Mon Dec 01, 03:45:00 AM EST  

Anonymous Anonymous said....

Agree varchar2(4000) for every string is a bit of a cop-out in terms of modeling to the actual values. But from a DBA or storage perspective, is there a harm? Doesn't Oracle auto-expand to only what it needs -- it doesn't literally use 4000 for each one (like char) does it?

Wed Dec 15, 08:31:00 PM EST  

Blogger Thomas Kyte said....

@anonymous

But from a DBA or storage perspective, is there a harm

is there harm from the DBA perspective? yes, because they are there to protect the integrity of the data and they have just ensured that it will not be protected.

The data type is an integrity constraint, plain and simple. No constraints, no data integrity.

And please don't tell me "the application will take care of it". I've heard that myth enough in my life already....

Thu Dec 16, 02:42:00 AM EST  

Anonymous Anonymous said....

So, varchar2(4000) vs varchar2(40) is not a storage or space concern -- more just a way to protect data quality by enforcing the rule that certain columns should only be so wide?

Thu Dec 16, 02:55:00 PM EST  

Blogger Thomas Kyte said....

@anonymous

I'd like you to think about how it could be a serious client side memory concern.

think about 10 columns which should average 40 characters in MAX length but average 4000 characters in MAX length.

think about a client api that fetches 100 rows at a time (you know, a framework or something written up by your developers). How might they figure out how big of a fetch buffer to allocate? (probably DESCRIBE on the query - 100x10x40 vs 100x10x4000)

think about a session with a couple open cursors

in a connection pool with a few dozen connections


do the math - it can be painful for 'storage' reasons too.

Thu Dec 16, 02:58:00 PM EST  

Blogger Thomas Kyte said....

@anonymous

I'd like you to think about how it could be a serious client side memory concern.

think about 10 columns which should average 40 characters in MAX length but average 4000 characters in MAX length.

think about a client api that fetches 100 rows at a time (you know, a framework or something written up by your developers). How might they figure out how big of a fetch buffer to allocate? (probably DESCRIBE on the query - 100x10x40 vs 100x10x4000)

think about a session with a couple open cursors

in a connection pool with a few dozen connections


do the math - it can be painful for 'storage' reasons too.

Thu Dec 16, 02:58:00 PM EST  

POST A COMMENT

<< Home