Friday, March 23, 2007

Stringing them up...

Suppose someone asks you:

I need to take the results of a query and pivot a value. That is, I would like output from the EMP table to look like this:

DEPTNO  ENAMES
------- ------------
10      clark king miller
20      adams ford …

can this be done in just SQL?

With the addition of analytic functions in Oracle 8i Release 2 and the SYS_CONNECT_BY_PATH() function in Oracle 9i Release 1 – this became something we can in fact do rather easily in SQL. The approach we will take is:

  1. Partition the data by DEPTNO and within each DEPTNO sort the data by ENAME and assign a sequential number using the ROW_NUMBER() analytic function
  2. So, we end up with a record eventually that is the result of connecting 1 to 2 to 3 to 4 and so on for each DEPTNO
  3.  The SYS_CONNECT_BY_PATH() function will return the list of ENAMES concatenated together for us.

The query would look like this:

SQL> select deptno,
2 max(sys_connect_by_path(ename, ' ' )) scbp
3 from (select deptno, ename,
row_number() over
(partition by deptno order by ename) rn
4 from emp
5 )
6 start with rn = 1
7 connect by prior rn = rn-1 and prior deptno = deptno
8 group by deptno
9 order by deptno
10 /

DEPTNO SCBP
---------- ----------------------------------------
10 CLARK KING MILLER
20 ADAMS FORD JONES SCOTT SMITH
30 ALLEN BLAKE JAMES MARTIN TURNER WARD

 


I used to use STRAGG for this (prior to sys_connect_by_path and analytics) - but now find this approach preferable.

POST A COMMENT

46 Comments:

Blogger Laurent Schneider said....

why? I like your stragg

Fri Mar 23, 11:13:00 AM EDT  

Blogger Thomas Kyte said....

why? I like your stragg

no need to create the type and type body to implement the user defined aggregate.

This is "cleaner"

Fri Mar 23, 11:17:00 AM EDT  

Anonymous Sokrates said....

indeed, that looks nicer than stragg

is

select deptno, scbp from
(
select deptno, sys_connect_by_path(ename, ' ' ) scbp, co, rn
from (
select deptno, ename,
row_number() over (partition by deptno order by ename) rn,
count(*) over (partition by deptno) co
from scott.emp
)
start with rn=1
connect by prior rn = rn-1 and prior deptno = deptno
)
where co=rn
order by deptno

an improvement or a worsening ?

Fri Mar 23, 11:37:00 AM EDT  

Blogger Laurent Schneider said....

Firstly, I find that it is more difficult to understand what the code is supposed to do, and it is much more easy (for most of us) to put an error in the connect by as by reusing YOUR stragg function.

Secondly, I think connect by will not scale well

A little test (well, it is just a test, it is not supposed to prove anything)


set timi on lin 9999 trims on newp none pages 0 termout off

spool /tmp/stragg.txt
select trunc(object_id/100),stragg(object_name)
from all_objects
group by trunc(object_id/100)
-- having stragg(object_name)='XYZ';
spool off

spool /tmp/connect.txt
select o,
max(sys_connect_by_path(object_name, ' ' )) scbp
from (select trunc(object_id/100) o, object_name,
row_number() over
(partition by trunc(object_id/100) order by object_name) rn
from all_objects
)
start with rn = 1
connect by prior rn = rn-1 and prior o = o
group by o
-- having max(sys_connect_by_path(ename, ' ' ))='XYZ'
/
spool off



the first query finished in 5 secs. After 3 min, the second one ended with ORA-30004: when using SYS_CONNECT_BY_PATH function, cannot have seperator as part of column value (why?)

If I replace all_objects by user_objects the first try takes 0.03 sec and the second one takes 0.35.

So when used with such a complex view as ALL_OBJECTS, connect by artifice is not very attractive to me.

Fri Mar 23, 11:48:00 AM EDT  

Anonymous Anonymous said....

I hope you write a book on complex sql u sing oracle features such as the model clause(very little documentation on this) and analytic functions.

these are often difficult to grasp for complex queries.

Fri Mar 23, 01:02:00 PM EDT  

Anonymous Milo said....

Yikes. ORA-30004 seems like enough of a reason to stay clear of this. Although you could ofcourse "fix" it with TRANSLATE(sys_connect_by_path(ename, CHR(1)), CHR(1), ' ').

The performance problems would definitely push me over the edge and stick with stragg, though. If I ever had a need for either...

Fri Mar 23, 04:02:00 PM EDT  

Anonymous Anne said....

I get the same ORA-30004 using the analytics approach if any of the strings in which I'm concatenating contain the character that is the same as the delimiter (space in this case). So, if I have a table called "my table" and the delimiter is ' ', then the ORA-30004 occurs. Try using sys_connect_by_path(object_name, ',' ).

Fri Mar 23, 05:22:00 PM EDT  

Anonymous neil on wheels aka dbms_cycle said....

How about a way to pivot but include duplicates? Picture a 1:M relationship, where the M may not necessarily be unique. Can that be done using sys_connect_by_path?

Fri Mar 23, 05:53:00 PM EDT  

Blogger SnippetyJoe said....

Funny this topic should come up. I just recently wrote a whole bunch of tutorials on a lot of the things discussed here.

Neil on wheels, I've got examples that show how to include or exclude duplicates with the sys_connect_by_path approach at
SQL Snippets: Delimited Strings - Rows to String - Hierarchical Method.

Those of your interested in comparing the sys_connect_by_path, STRAGG, and other approaches may want to check out the feature comparison chart at SQL Snippets: Delimited Strings - Rows to String and the related tutorials whose links are listed in that page.

For the anonymous user who asked about the MODEL clause, check out SQL Snippets: Delimited Strings - Rows to String - MODEL Method and SQL Snippets: SQL Features Tutorials - MODEL Clause.

Finally, there's a performance comparison chart of all the methods covered in the Delimited String tutorials at SQL Snippets: Delimited Strings - Rows to String - Performance Comparison Charts. The sys_connect_by_path approach turned out to use way more latches than the STRAGG approach so Laurent might be right about sys_connect_by_path not scaling as well as STRAGG.

Hope you find the info. useful.

Sat Mar 24, 12:12:00 AM EDT  

Blogger SnippetyJoe said....

BTW, I failed to make something clear in my last comment. The sys_connect_by_path solution I talk about in my tutorials differs slightly from Tom's solution in the original post. The performance comparison chart I mentioned compares STRAGG with my version of sys_connect_by_path, not Tom's.

Sorry for any confusion.

Sat Mar 24, 12:34:00 AM EDT  

Anonymous Freek said....

Laurent,

In your test, you must also consider that the analytic/sys_connect_by_path solution is giving you an ordered string aggregate, which is not true for the original stragg solution.

regards

Freek

Sun Mar 25, 05:26:00 AM EDT  

Anonymous Anonymous said....

One approach is to create and use a simple dynamic-SQL function called "listify":

SELECT DISTINCT e.deptno,
listify('select ename from emp e2 where e2.deptno='||e.deptno,' ')
FROM emp e

CREATE OR REPLACE FUNCTION listify
(sql_stmt_in IN VARCHAR2,
separator_in VARCHAR2)
RETURN VARCHAR2
/*
* Created 3/22/07 J. Martin to return a single value from a list of arbitrary items returned by
* an input SQL statement.
* The input SQL statement should select exactly one character column, less than 4000 characters.
* The returned value is up to 4000 characters, if truncation occurs then an ellipsis (...) is
* appended after the last value that can be fit into 4000 characters.
*/
IS
return_string VARCHAR2(4000):='';
list_item VARCHAR2(4000):='';
TYPE cur_typ IS REF CURSOR;
get_list cur_typ;
BEGIN
OPEN get_list FOR sql_stmt_in;
LOOP
FETCH get_list INTO list_item;
EXIT WHEN get_list%NOTFOUND;

IF list_item IS NOT NULL THEN
IF LENGTH(list_item) + NVL(LENGTH(return_string),0) + NVL(LENGTH(separator_in),0) + 4 < 4000 THEN
IF return_string IS NULL
THEN return_string := list_item;
ELSE return_string := return_string || separator_in || list_item;
END IF;
ELSE return_string := return_string || ' ...';
EXIT;
END IF;
END IF;
END LOOP;

CLOSE get_list;
RETURN(return_string);
END;
/

This requires a parse for each record, but is simple and has linear performance.

Thoughts?

Sun Mar 25, 11:38:00 PM EDT  

Blogger Shuchi said....

I find it neat to use SQLX for pivoting as below. What do you think of it?

select deptno,
substr(replace(replace(xmlagg(xmlelement("x",ename) order by ename),'</x>'),'<x>',' '),2) scbp
from emp
group by deptno
order by deptno;

Mon Mar 26, 03:51:00 AM EDT  

Anonymous Anonymous said....

SnippetyJoe:
thanx for pointing out
connect_by_isleaf

Mon Mar 26, 03:57:00 AM EDT  

Blogger Hariharan Ragunathan said....

Hi Tom the same stringing rows I posted in my blog some time back on topic Concatenating Rows in a Table...

After seeing your post I was happy that I have not explained it wrong..

Mon Mar 26, 09:41:00 AM EDT  

Blogger Alberto Dell'Era said....

But I wonder - has anyone ever submitted an Enhancement Request asking to implement a string concatenation grouping/analytic function ?

Something like xmlagg() which is the neatest solution in my opinion (but watch out for special characters such as "greater than" and "ampersand", that gets converted to their HTML/XML s escape representation ;) )

Mon Mar 26, 03:19:00 PM EDT  

Anonymous Frank Zhou said....

I think using SQL model clause ( Parallel processing using "Partition by (dept ) ") can combines the best of both worlds.

Frank



SELECT fin_str
FROM (
SELECT fin_str, dept, name FROM t
MODEL
PARTITION BY (dept)
DIMENSION BY (row_number() over (PARTITION BY dept ORDER BY name) rn )
RULES (
name_Str[ANY] ORDER BY rn =
CASE WHEN name[cv() - 1 ] IS NULL
THEN name[cv()]
ELSE name_Str[cv()-1]||','|| name[cv()]
END,
fin_str[ANY] ORDER BY rn = CASE WHEN name[cv()+1] IS NULL
THEN name_Str[cv()]
END
)
WHERE fin_str is not null


Or

http://www.jlcomp.demon.co.uk/faq/Stragg_Queries.html

Mon Mar 26, 04:15:00 PM EDT  

Blogger Vadim Tropashko said....

From http://vadimtropashko.files.wordpress.com/2007/02/ch3.pdf

with concat_enames as (
select deptno, sys_connect_by_path(ename,',') aggr, level depth
from emp e
start with ename=(select min(ename) from emp ee
where e.deptno=ee.deptno)
connect by ename > prior ename and deptno = prior deptno
) select deptno, aggr from concat_enames e
where depth=(select max(depth) from concat_enames ee
where ee.deptno = e.deptno);

Mon Mar 26, 07:57:00 PM EDT  

Anonymous Anonymous said....

alberto:
it looks to me nobody here needs an
enhancement due to the abundance of solutions.

have you submitted an enhancement request?

Tue Mar 27, 03:20:00 AM EDT  

Blogger Alberto Dell'Era said....

Anonymous:
it looks to me that every solution here has its issues (or complications): "stragg" is neat but requires pre-install and context switches to the pl/sql engine; "connect by level" syntax is not intuitive plus the other gotchas; "model" clause syntax is a tad ugly; the closest to ideal is xmlagg, but has its gotchas as well with special characters.

Nothing would beat a natively implemented function modeled after xmlagg(); say:

select concat (ename order by ename) from emp;

and the analogous analytic function.

I have submitted no ER.

Tue Mar 27, 06:11:00 AM EDT  

Anonymous Shop said....

Very interesting!

Sat Aug 04, 10:23:00 AM EDT  

Blogger Alberto Dell'Era said....

A tip for people interested in the xmlagg() trick - xmlelement() converts the 5 XML special characters (ampersand, less than, greater than, single-quote and double-quote) in their alternative entity rapresentation.

To get them back after they are aggregated by xmlagg(), there's the handy dbms_xmlgen.convert() utility - just pass 1 as the second parameter.
Got the dbms_xmlgen trick from Paweł Barut.

Sat Dec 15, 10:43:00 AM EST  

Anonymous Anonymous said....

Very Good article , this article make some interesting points.
Tactical Flashlights
r c helicopter
video game
Tactical Flashlight

Mon Jun 23, 02:46:00 AM EDT  

Anonymous Anonymous said....

it is very good.I love Oracle.Thanks

Wed Dec 24, 08:04:00 AM EST  

Blogger sexy said....

情趣用品,情趣用品,情趣用品,情趣用品,情趣用品,情趣用品,情趣,情趣,情趣,情趣,情趣,情趣,情趣用品,情趣用品,情趣,情趣,A片,A片,情色,A片,A片,情色,A片,A片,情趣用品,A片,情趣用品,A片,情趣用品,a片,情趣用品

A片,A片,AV女優,色情,成人,做愛,情色,AIO,視訊聊天室,SEX,聊天室,自拍,AV,情色,成人,情色,aio,sex,成人,情色

免費A片,美女視訊,情色交友,免費AV,色情網站,辣妹視訊,美女交友,色情影片,成人影片,成人網站,H漫,18成人,成人圖片,成人漫畫,情色網,日本A片,免費A片下載,性愛

情色文學,色情A片,A片下載,色情遊戲,色情影片,色情聊天室,情色電影,免費視訊,免費視訊聊天,免費視訊聊天室,一葉情貼圖片區,情色視訊,免費成人影片,視訊交友,視訊聊天,言情小說,愛情小說,AV片,A漫,AVDVD,情色論壇,視訊美女,AV成人網,成人交友,成人電影,成人貼圖,成人小說,成人文章,成人圖片區,成人遊戲,愛情公寓,情色貼圖,色情小說,情色小說,成人論壇

a片下載,線上a片,av女優,av,成人電影,成人,成人貼圖,成人交友,成人圖片,18成人,成人小說,成人圖片區,成人文章,成人影城,成人網站,自拍,尋夢園聊天室

A片,A片,A片下載,做愛,成人電影,.18成人,日本A片,情色小說,情色電影,成人影城,自拍,情色論壇,成人論壇,情色貼圖,情色,免費A片,成人,成人網站,成人圖片,AV女優,成人光碟,色情,色情影片,免費A片下載,SEX,AV,色情網站,本土自拍,性愛,成人影片,情色文學,成人文章,成人圖片區,成人貼圖

Wed Feb 04, 09:37:00 AM EST  

Blogger R.V.Reddy said....

Hi,
I am trying to get the user objects by procedure,tables and others( turning rows in to columns for each object_type) using below query. But all objects are displaying along with NULL's. How can I get rid of the NULL values between the objects.


SELECT TABLE_NAMES,PROCEDURE_NAMES,VIEW_NAMES,INDEX_NAMES
FROM (
SELECT
MAX(CASE WHEN OBJECT_TYPE='TABLE' THEN OBJECT_NAME END) TABLE_NAMES,
MAX(CASE WHEN OBJECT_TYPE='PROCEDURE' THEN OBJECT_NAME END) PROCEDURE_NAMES,
MAX(CASE WHEN OBJECT_TYPE='VIEW' THEN OBJECT_NAME END) VIEW_NAMES,
MAX(CASE WHEN OBJECT_TYPE='INDEX' THEN OBJECT_NAME END) INDEX_NAMES
FROM USER_OBJECTS
GROUP BY OBJECT_NAME);

Thanks,
Venkat

Fri Mar 13, 02:50:00 PM EDT  

Blogger Thomas Kyte said....

@Venkat

please specify your question in the form of a specification, similar to one you would give a programmmer, pretend I'm your "mother", explain it in enough detail with examples/whatever so it is understandable.

I'm pretty sure I know what you mean, I think - I THINK - you want four columns - each with a list of as many objects of that type as there are - sort of like you might have in a spreadsheet if you typed this stuff in - but I want you to enter a specification that one can generate code from.

Fri Mar 13, 03:13:00 PM EDT  

Blogger Deb said....

I like this example and I have a case where my strings I need to concatenate are 4000 in size, therefore the resulting string becomes a clob. Also in the mix is that the table starts with 18 million records, and the resulting table once fields are concatenated will be 6+ million records. Is there a way to do this that will not take hours and hours and hours? I've been trying the function route, but cannot get it to finish after 7+ hours of running. I'd like to find a solution that is at least under 4 hours. Any ideas are appreciated.

Mon Mar 23, 06:36:00 PM EDT  

Blogger Thomas Kyte said....

@Deb,

you would on average turn 3 records into 1.

why bother? Why string these up? What would that accomplish?

What is the goal here.

Mon Mar 23, 09:41:00 PM EDT  

Blogger Deb said....

I have to concatenate them because they are going in Matrix One as 1 single attribute on an object. This is a data migration from an oracle home-grown system to a full install of Matrix One (oracle-based) and we are loading the MatrixOne database via a tool that generates sql loader files. It's a long story but concatenating into a single column is the only solution for this piece of data.

Fri Mar 27, 07:19:00 PM EDT  

Blogger Thomas Kyte said....

@deb

if you


select max(sum_len)
from(
select [columns you would group by], sum(length(column-to-concat)) sum_len
...
)

what is the max length?

Fri Mar 27, 07:26:00 PM EDT  

Blogger Deb said....

Hi Tom

The max length on some of the records exceeds 32767. Since my last post to you, I have converted code to use DBMS_SQL package and I can successfully insert into the LONG column anything < 32767. However, I am still stuck with the stuff > 32767. I tried piece-wise inserting by concatenation where my dynamical sql statement with bind variables looked like:

"...,:b1||:b2||:b3...)" where that is the column that is the LONG and each of the variables going into b1, b2, b3...were 32767 sized chunks of the original text. This didn't work either.

Seems like I am at the point were I need to consider SQLLDR or something else, not sure. Seems like this part cannot be done in PL/SQL. Is that correct? If so, what are my options? I cannot change the target schema because MatrixOne does not use CLOBs, only LONGs.

Any help is appreciated.

Tue Mar 31, 12:38:00 PM EDT  

Blogger Thomas Kyte said....

@deb,

your options are very (extremely) limited.

PLSQL - not a chance, it can piecewise READ a long, but that is it.

SQL - not directly, not just with SQL, we'll need an API


You will have to write code in some 3gl (java, C, VB, whatever) to do this. You could either

a) produce a file for sqlldr from this 3gl - plsql cannot produce it because plsql cannot generate a line exceeding 32k (plsql will inject a linebreak whether you want it or not)

b) have the program read the rows, construct the "long" and then "stream" it into the other table.

Tue Mar 31, 03:51:00 PM EDT  

Anonymous Anonymous said....

Very usefule example.
I have a scenario in which i have to do multiple aggregation. i.e. Something which displays
Name----------Likes---------Dislikes
Tom.......Apple, Orange....Mango, Banana
Bob........Blue Berry.........Apple, Banana

As you can see there are multiple columns with aggregated stuff. Any idea how to extend this approach to achieve it.

Fri Jul 24, 07:48:00 PM EDT  

Blogger Thomas Kyte said....

@Anonymous

simple, just use sys_connect_by_path more than once

Sat Jul 25, 03:02:00 AM EDT  

Blogger Alberto Dell'ERa said....

They have finally implemented a version of STRAGG natively in 11GR2 as LISTAGG. Very nice, I wonder whether this blog discussion has contributed to the design ;)

Thu Sep 03, 04:46:00 AM EDT  

Anonymous Anonymous said....

Hi,

I'm having a hard time trying to work this out when trying to do multiple joins. Any tips? I am new to Oracle, so any optimization red flags appreciated as well.

SELECT DISTINCT d.ppuid ,
d.product_type ,
d.provider ,
d.title ,
SUBSTR(MAX(REPLACE(SYS_CONNECT_BY_PATH(metadata_value, '/'),'/','; ')),2) Concatenated_Keywords
FROM qpcin_rr_cat.cat_prod_metadata_pub a,
qpcin_rr_cat.cat_product_pub b ,
qpcin_rr_repo.inv_ingestion_event d
WHERE a.product_pub_id = b.id
AND b.uuid = d.product_uuid
AND d.status = 'SUCCESS'
AND a.metadata_name = 'keywords'
(SELECT cpmp.*,
row_number() OVER (order by seq_num) ROW#
FROM QPCIN_RR_CAT.CAT_PROD_METADATA_PUB cpmp
WHERE CPMP.PRODUCT_PUB_ID = 132214423
AND metadata_name = 'keywords'
) START
WITH ROW#=1 CONNECT BY PRIOR row# = row# -1


I'm using oracle developer.

Tue Nov 17, 06:45:00 PM EST  

Blogger Thomas Kyte said....

@last anonymous
No one can answer your question, because ....

we don't know what you are really trying to do.

But, let's say you have a query


select c1, c2, c3, c4,..., cn, STRING
from .....
where .....


And you wanted to get a row for each distinct c1, c2, c3, c4, ... cn and havae STRING be "concatenated"


that would be

select c1, c2, ... , cn,
max(sys_connect_by_path(STRING, ' ' )) scbp
from (

select c1, ... , cn, string,
row_number() over ( partition by c1, ..., cn order by string )rn
from ....
where ....

)
start with rn = 1
connect by prior rn = rn-1 and prior c1 = c1 and prior c2 = c2 and .... prior cn = cn
group by c1, ... cn
order by c1, ... cn
10 /

Tue Nov 17, 07:23:00 PM EST  

Blogger Hamsandwhich said....

Hi -- (Hamsandwhich here,aka most recent anonymous guy). Thanks for the tips. I'll explain a bit more what I am trying to do. I have a table that has a field named 'metadata_name', and another field called 'metadata_value'. For each product, I want to only pull metadata_name fields that are called 'keywords', and concatenate any of the metadata_value fields into a single string. So instead of this:

keywords cat
keywords hat
keywords bat

I want keywords cat|hat|bat

A couple things that are tripping me up though. The first is that I have an unknown number of keywords that need to be concatenated, and a large number of products ( ppuids) that would have their own corresponding concatenated strings. To make things extra fun, I need to access some other tables to get contextual info like product title, etc. So I think I want to do a select of selects, and also join a few tables. Not sure how to integrate the your example when I want to output more fields than just the field that has the concatenated values?

To use your example, it would be like adding address and phone numbers to your ename data... if you had to do your example while accessing other tables to get address/phone numbers, how would you do that?

Hope that made sense. :)

Wed Nov 18, 11:20:00 AM EST  

Blogger Thomas Kyte said....

@Hamsandwhich said...

I can only generalize so far, you draw a simple example:



keywords cat
keywords hat
keywords bat

I want keywords cat|hat|bat


*and then* start throwing in things like ppuid (what?) and "access some other tables to get contextual info like product title, etc. "


I tried to generalize - I'll do it again, only verbally this time.


Get your query (the one that returns all of the data)

segregate your columns into one of two categories

category a: your "key", things to group by

category b: your columns to aggregate - the things to string up.


now, add to that list row_number() over (partition by CATEGORYA order by WHATEVER YOU WANT)


then

select categorya,
max(sys_connect_by_path())
max(sys_connect_by_path())
....
/* max as many times as you have elements in categoryb */
from (YOUR_QUERY_WITH_ROW_NUMBER)
start with rn =1
connect by rn = prior rn+1
and categorya_c1 = prior categorya_c1 ....
group by categorya;



Unless you give a CONCRETE example with "YOUR_QUERY", this is as far as well get.

Wed Nov 18, 11:30:00 AM EST  

Blogger Hamsandwhich said....

Okay, thanks. Sorry for the vagueness. I appreciate the help.

I found that this query works when looking at a single item:

SELECT metadata_name,
MAX(sys_connect_by_path(metadata_value, '|' )) concatenated_keywords
FROM
(SELECT cpmp.*,
row_number() over (partition BY metadata_name order by metadata_value) rn
FROM qpcin_rr_cat.cat_prod_metadata_pub cpmp
WHERE cpmp.metadata_name = 'keywords'
AND cpmp.product_pub_id = '132214423'
) START
WITH rn = 1 CONNECT BY prior rn = rn-1
AND prior metadata_name = metadata_name
GROUP BY metadata_name
ORDER BY metadata_name


The output looks like this:

keywords |HOG|Harley|Harley-Davidson|american|biker|bikes|legend|logo|motorcyles|power|tough

This query/output represents one item of my table though. What if I want to remove the part of the query restricting the results to id '132214423', and want the output to include all ids? In other words, I want to group all the keyword data together, but I want to do that for every id. A grouping of groupings. I'm trying to figure out how to select the results from the first select. Does that make sense?

Wed Nov 18, 12:10:00 PM EST  

Blogger Thomas Kyte said....

@Hamsandwhich said...

just change

metadata_name

to

metadata_name, product_pub_id

in the query (your categoryA is metadata_name, id)

and change the connect by to include product_pub_id



SELECT metadata_name, product_pub_id,
MAX(sys_connect_by_path(metadata_value, '|' )) concatenated_keywords
FROM
(SELECT cpmp.*,
row_number() over (partition BY metadata_name, product_pub_id order by metadata_value) rn
FROM qpcin_rr_cat.cat_prod_metadata_pub cpmp
WHERE cpmp.metadata_name = 'keywords'

) START
WITH rn = 1 CONNECT BY prior rn = rn-1
AND prior metadata_name = metadata_name and prior product_pub_id = product_pub_id
GROUP BY metadata_name, product_pub_id
ORDER BY metadata_name, product_pub_id

Wed Nov 18, 12:17:00 PM EST  

Blogger Verily said....

AH... thank you very much sir! I think I get it now.

Wed Nov 18, 12:28:00 PM EST  

Anonymous Rin said....

Hi Tom,

I came across your preferred method of string concatenation and wonder if you could advise me as to whether my implementation is any better than using your STRAGG approach. I like your preferred method but it seems somewhat slower than STRAGG. I include the code for both (apologies for the amount of it and format).
STRAGG SOLUTION:
SELECT DISTINCT
p.person_id,p.surname,p.first_name,student.academic_year,
stragg(student.enrolment_status_code) over (partition by student.person_id) enrolment_status_code,
DECODE (staff.person_id, '', 'N', 'Y') is_staff,
DECODE (student.person_id, '', 'N', 'Y') is_student,
DECODE (SUBSTR (student.course_code, 2, 1), 'U', 'Y', 'N') is_undergraduate,
DECODE (SUBSTR (student.course_code, 2, 1), 'U', 'N', NULL, 'N','Y')is_postgraduate
FROM person@dprod p,
person_id_admin@dprod pia,
staff_appointment@dprod staff,
(SELECT academic_year,
enrolment_status_code,
person_id,
course_code
FROM student_registration@dprod s
WHERE SYSDATE BETWEEN TO_DATE ('01-08-'|| SUBSTR (academic_year,1,2),'dd-mm-yy') AND TO_DATE ('01-08-'|| SUBSTR (academic_year,4,2),'dd-mm-yy')+ 65 AND enrolment_status_code NOT IN ('T', 'WD', 'S', 'SD', 'NS')) student
WHERE p.person_id = pia.person_id
AND pia.person_id = staff.person_id(+)
AND pia.person_id = student.person_id(+)
AND SYSDATE <= NVL (p.leaving_date, SYSDATE)
AND NVL (p.start_date, SYSDATE) <= SYSDATE;

CONNECT BY PRIOR SOLUTION:
SELECT DISTINCT p.person_id, p.surname,p.first_name,student.academic_year,student.enrolment_status_code,DECODE (staff.person_id, '', 'N', 'Y') is_staff,NVL (student.is_student, 'N') is_student,NVL (student.is_undergraduate, 'N') is_undergraduate, NVL (student.is_postgraduate, 'N') is_postgraduate
FROM person@dprod p,
person_id_admin@dprod pia,
staff_appointment@dprod staff,
( SELECT academic_year,MAX (SYS_CONNECT_BY_PATH (enrolment_status_code, ',')) enrolment_status_code,person_id,is_student,is_undergraduate,is_postgraduate
FROM (SELECT academic_year,enrolment_status_code,person_id,course_code,ROW_NUMBER () OVER (PARTITION BY person_id ORDER BY person_id) rn,
DECODE (s.person_id, '', 'N', 'Y') is_student,
DECODE (SUBSTR (s.course_code, 2, 1),'U', 'Y','N') is_undergraduate,
DECODE (SUBSTR (s.course_code, 2, 1),'U', 'N',NULL, 'N','Y')is_postgraduate
FROM student_registration@dprod s
WHERE SYSDATE BETWEEN TO_DATE ('01-08-'|| SUBSTR (academic_year,1,2),'dd-mm-yy') AND TO_DATE ('01-08-'|| SUBSTR (academic_year,4,2),'dd-mm-yy')+ 65 AND enrolment_status_code NOT IN ('T', 'WD', 'S', 'SD', 'NS'))
START WITH rn = 1
CONNECT BY PRIOR rn = rn - 1 AND PRIOR person_id = person_id
GROUP BY person_id,academic_year,is_student,is_undergraduate,is_postgraduate) student
WHERE p.person_id = pia.person_id
AND pia.person_id = staff.person_id(+)
AND pia.person_id = student.person_id(+)
AND SYSDATE <= NVL (p.leaving_date, SYSDATE)
AND NVL (p.start_date, SYSDATE) <= SYSDATE;

I'm not the most confident of coders and would appreciate your views very much,

many thanks

Rin

Wed Apr 21, 04:22:00 AM EDT  

Blogger Thomas Kyte said....

I have found stragg to be pleasantly efficient - and pretty speedy.

I use the max(sys_connect_by_path) trick when not allowed to install stragg for whatever reason

and in 11g - I would replace both with listagg if at all possible. Listagg is similar to stragg and is bulitin in 11g.


So, in order of preference

listagg in 11g
stragg
max(sys_connect_by_path)

Wed Apr 21, 08:24:00 AM EDT  

Anonymous Rin said....

Hi Tom,

Thanks for your feedback, very helpful.

kind regards

Rin

Thu Apr 22, 03:50:00 AM EDT  

POST A COMMENT

<< Home