Order in the court!
As I was proofing my chapters yesterday, I couldn't help but notice that in three chapters I said the same thing, stronger and stronger each time. It was about using (or not using more specifically) ORDER BY. I see it time and time and time again.Someone assumes that because they observe the data coming out sorted from a query or that a query is using an index - they can and should leave off the ORDER BY. That is a career limiting move to borrow a phrase from Connor McDonald (he used it in reference to reorganizing your database but it applies totally here). It is a time bomb ticking away in your code.
Did you know simple optimizer change - no more sort. Data could appear to be sorted today, statistics gathered and not sorted tomorrow.
Did you know that reading data out via an index - does not imply data is sorted by the index key!
That a group by does not sort (necessarily, and even if it does, it is a binary sort – which is different than a character sort for those of us outside of US7ASCII land). Unfortunately, part of that thread is missing for some reason (I remember it well, I loved the comment: “I have frequently thought that such 'preaching' is simply an indication that the person answering isn't willing to admit that they don't know how to do it” said right before they tried to show someone how to sort using GROUP BY. Pot, kettle, and black all popped into my head at that).
Anyway, the bottom line – you want data sorted? You better use ORDER BY because you cannot have any expectations on the sorted nature of the data otherwise!
In chapter 10 on tables, I wrote:
You should think of a heap organized table as a big unordered collection of rows. These rows will come out in a seemingly random order, and depending on other options being used (parallel query, different optimizer modes and so on), they may come out in a different order with the same query. Do not ever count on the order of rows from a query unless you have an ORDER BY statement on your query!
In chapter 11 on indexes, I wrote:
Note Do not be tempted to ever leave an ORDER BY off a query. Just because your query plan includes an index does not mean the data will be returned in "some order." The only way to retrieve data from the database in some sorted order is to include an ORDER BY on your query. There is no substitute for ORDER BY.
And finally, in the chapter on partitioning, I did it again, a tad stronger.
This example brought to mind an unrelated but very important fact. When looking at hash partitioned indexes, we are faced with another case where the use of a index to retrieve data would not automatically retrieve the data sorted. Many people assume that if the query plan shows an index is used to retrieve the data, the data will be retrieved sorted. This has never been true. The only way we can retrieve data in any sort of sorted order is to use ORDER BY on the query. If your query does not contain an ORDER BY statement, you cannot make any assumptions about the sorted order of the data.
A quick example demonstrates this. We create a small table as a copy of ALL_USERS and create a hash partitioned index with four partitions on the USER_ID column:
SQL> create table t
3 select *
4 from all_users
SQL> create index t_idx
2 on t(user_id)
4 partition by hash(user_id)
5 partitions 4
Now, we will query that table and use a hint to have Oracle use the index. Notice the ordering (actually, the lack of ordering) of the data:
SQL> set autotrace on explain
SQL> select /*+ index( t t_idx ) */ user_id
2 from t
3 where user_id > 0
38 rows selected.
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=38 Bytes=494)
1 0 PARTITION HASH (ALL) (Cost=4 Card=38 Bytes=494)
2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=4 Card=38 Bytes=494)
SQL> set autotrace off
So, even though Oracle used the index in a range scan, the data is obviously not sorted. In fact, you might observe a pattern in this data. There are "four sorted" results here; the … replaces values that were increasing in value; and between the rows with USER_ID = 34 and 81, the values were increasing in the output. Then the row with USER_ID = 19 appeared. What we are observing is Oracle returning "sorted data" from each of the four hash partitions, one after the other.
This is just a warning that unless your query has an ORDER BY, you have no reason to anticipate the data being returned to you in any kind of sorted order whatsoever. (And no, GROUP BY doesn't have to sort either! There is no substitute for ORDER BY.)