Category Archives: xml

xml and powershell : using XPATH

I wrote about powershell [xml] yesterady : xml and powershell

Let’s see how to use XPATH expressions in Powershell


<emplist>
  <emp no="1">
    <ename>John</ename>
  </emp>
  <emp no="2">
    <ename>Jack</ename>
  </emp>
</emplist>

With the [xml] datatype, we create a navigator :

(([xml](GC foo.xml)).psbase.createnavigator().evaluate(
'//emplist/emp[@no="1"]/ename/text()'
))|%{$_.Value}

John

I have not been seduced by a Microsoft product since ages, but I must say I felt with love in this goody much more than in perl, cygwin, or whatever python, dos, java, vb…

It is simply great to use on the command line and can do my work.

1:0 for Microsoft

extract xml from the command line

I just discovered this morning this cool utility in my /bin directory : xmllint

You can use it to extract values from your xml files within your shell scripts


$ cat foo.xml
<emplist>
  <emp no="1">
    <ename>John</ename>
  </emp>
  <emp no="2">
    <ename>Jack</ename>
  </emp>
</emplist>
$ echo 'cat //emplist/emp[@no="1"]/ename/text()'|
  xmllint --shell foo.xml |
  sed -n 3p
John

I like this !

select from comma-separated list

This is asked over and over in the forums, but why not proposing an 11g solution here ;)

create table t(description varchar2(12) primary key, 
  numbers varchar2(4000));
insert into t(description, numbers) values ('PRIME','2,3,5,7');
insert into t(description, numbers) values ('ODD','1,3,5,7,9');
commit;

DESCRIPTION NUMBERS
PRIME 2,3,5,7
ODD 1,3,5,7,9

Now I want to unpivot numbers in rows


select description,(column_value).getnumberval()  
from t,xmltable(numbers)

DESCRIPTION (COLUMN_VALUE).GETNUMBERVAL()
PRIME 2
PRIME 3
PRIME 5
PRIME 7
ODD 1
ODD 3
ODD 5
ODD 7
ODD 9

It is that simple :)

Works also with strings :


select (column_value).getstringval() 
from xmltable('"a","b","c"');

(COLUMN_VALUE).GETSTRINGVAL()
a
b
c

high cost

What’s wrong with this query ?


select
(t6.column_value).getstringval() t6
from
table(xmlsequence(extract(xmltype(‘<x/>’),’/x’))) t1,
table(xmlsequence(t1.column_value))t2,
table(xmlsequence(t2.column_value))t3,
table(xmlsequence(t3.column_value))t4,
table(xmlsequence(t4.column_value))t5,
table(xmlsequence(t5.column_value))t6;
T6
————————
<x/>

Elapsed: 00:00:00.01

Well, let’s check the plan :


--------------------------------------------------------------------
| Id  | Operation          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT   |    18E|    15E|    18E  (0)|999:59:59 |
|   1 |  NESTED LOOPS      |    18E|    15E|    18E  (0)|999:59:59 |
|   2 |   NESTED LOOPS     |    18E|    15E|    99P  (3)|999:59:59 |
|   3 |    NESTED LOOPS    |  4451T|    31P|    12T  (3)|999:59:59 |
|   4 |     NESTED LOOPS   |   544G|  3045G|  1490M  (3)|999:59:59 |
|   5 |      NESTED LOOPS  |    66M|   254M|   182K  (3)| 00:36:31 |
|   6 |       COLLECTION I |       |       |            |          |
|   7 |       COLLECTION I |       |       |            |          |
|   8 |      COLLECTION IT |       |       |            |          |
|   9 |     COLLECTION ITE |       |       |            |          |
|  10 |    COLLECTION ITER |       |       |            |          |
|  11 |   COLLECTION ITERA |       |       |            |          |
--------------------------------------------------------------------

It is returning 18 quadrillions of rows, 15 exabytes, the cost is 1.8E19 and the time is about one month :mrgreen:

old-fashion listagg

Yesterday I had my first session about XML, today I have one about SQL Model

Ok, it was the first time I spoke about XML so I did not really now where to focus. XML is so big, you have XQUERY, XPATH, dozens of XML functions in the database.

One of the XML function is called XMLTRANSFORM and transforms XML according to XSLT

I had a fun demo about XSLT to create a semi-column separated list :


select
   deptno,
   xmltransform
   (
      sys_xmlagg
      (
         sys_xmlgen(ename)
      ),
     xmltype
     (
       '<?xml version="1.0"?><xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="/">
            <xsl:for-each select="/ROWSET/ENAME">
              <xsl:value-of select="text()"/>;</xsl:for-each>
          </xsl:template>
        </xsl:stylesheet>'
     )
  ).getstringval() listagg
from emp
group by deptno;


 DEPTNO LISTAGG
------- --------------------------------------
     10 CLARK;KING;MILLER;
     20 SMITH;FORD;ADAMS;SCOTT;JONES;
     30 ALLEN;BLAKE;MARTIN;TURNER;JAMES;WARD;

select in HTML format, XQUERY variant

Same as Select in html format, but with XMLTABLE instead of XMLTRANSFORM



select 
  xmlroot(
    xmlelement(
      "table",
      xmlconcat(
        xmltype(
'<tr><th>DEPTNO</th><th>DNAME</th><th>LOC</th></tr>'),
        xmlagg(column_value)
      )
    ),version '1.0'
  )
from 
  xmltable('
    for $f in ora:view("LSC_DEPT") 
    return 
      <tr>
        <td>{$f/ROW/DEPTNO/text()}</td> 
        <td>{$f/ROW/DNAME/text()}</td> 
        <td>{$f/ROW/LOC/text()}</td>
      </tr>');

XMLROOT(XMLELEMENT("TABLE",XMLCONCAT(XMLTYPE('<TR>
--------------------------------------------------
<?xml version="1.0"?>                             
<table>                                           
  <tr>                                            
    <th>DEPTNO</th>                               
    <th>DNAME</th>                                
    <th>LOC</th>                                  
  </tr>                                           
  <tr>                                            
    <td>10</td>                                   
    <td>ACCOUNTING</td>                           
    <td>NEW YORK</td>                             
  </tr>                                           
  <tr>                                            
    <td>20</td>                                   
    <td>RESEARCH</td>                             
    <td>DALLAS</td>                               
  </tr>                                           
  <tr>                                            
    <td>30</td>                                   
    <td>SALES</td>                                
    <td>CHICAGO</td>                              
  </tr>                                           
  <tr>                                            
    <td>40</td>                                   
    <td>OPERATIONS</td>                           
    <td>BOSTON</td>                               
  </tr>                                           
</table>                                          

select in HTML format

Last Wednesday I selected data from an HTML table : select from xml

Today, let’s try the opposite, generate an HTML table from a SQL query


select 
  XMLSERIALIZE(
    DOCUMENT
    XMLROOT(
      XMLTRANSFORM(
        XMLTYPE(
          CURSOR(
           SELECT * FROM DEPT
          )
        ),
        XMLTYPE.CREATEXML(
      '<?xml version="1.0"?>
<xsl:stylesheet version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:template match="/">
  <html>
    <head>
      <title>Table DEPT</title>
    </head>
    <body>
    <p>Table DEPT in HTML format</p>
    <table border="1">
      <tr>
        <th align="right">DEPTNO</th>
        <th align="left">DNAME</th>
        <th align="left">LOC</th>
      </tr>
<xsl:for-each select="/ROWSET/ROW">
      <tr>
        <td align="right"><xsl:value-of select="DEPTNO"/></td>
        <td align="left"><xsl:value-of select="DNAME"/></td>
        <td align="left"><xsl:value-of select="LOC"/></td>
      </tr>
</xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>'
        )
      )
    , VERSION '1.0')
  )
from DUAL;

Table DEPT in HTML format

DEPTNO DNAME LOC
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON

Largely inspired from xmlfiles.com

select from xml

My wife is so happy that an African president is ruling the world for the next four years !

Ok, here is a simple exercice :

How do you select from an HTML file ?

Take an example

The table of departments

10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON


create table T of XMLTYPE;

insert into T values(
  XMLTYPE(
'<?xml version="1.0" encoding="ISO-8859-1"?>
<html>
  <head>
    <title>Employee table</title>
  </head>
  <body>
    <p>The table of departments</p>
    <table border="1">
      <tr>
        <td>10</td>
        <td>ACCOUNTING</td>
        <td>NEW YORK</td>
      </tr>
      <tr>
        <td>20</td>
        <td>RESEARCH</td>
        <td>DALLAS</td>
      </tr>
      <tr>
        <td>30</td>
        <td>SALES</td>
        <td>CHICAGO</td>
      </tr>
      <tr>
        <td>40</td>
        <td>OPERATIONS</td>
        <td>BOSTON</td>
      </tr>
    </table>
  </body>
</html>'));

select deptno, dname, loc
from T t,
xmltable(‘/html/body/table/tr’
passing value(t) columns
deptno number path ‘/tr/td[1]‘,
dname varchar2(10) path ‘/tr/td[2]‘,
loc varchar2(10) path ‘/tr/td[3]‘
);


    DEPTNO DNAME      LOC
---------- ---------- ----------
        10 ACCOUNTING NEW YORK
        20 RESEARCH   DALLAS
        30 SALES      CHICAGO
        40 OPERATIONS BOSTON

xmltable is a 10gR2 features. For those with 9i, use TABLE(XMLSEQUENCE(EXTRACT(VALUE(t),’/html/body/table/tr’))) instead.

Note the html file must be in XHTML format, some rewritting may imply. SQLPLUS for example does not generate html files that are valid XHTML documents, for instance because of the <p> tags that to not have a </p> tag. Note that xml tags are case sensitive.

alert log in xml format

The alert log is in xml format in Oracle 11g. If you want to parse the content of the file and use the XML functions to retrieve the data, you can use this way :

SQL> create or replace directory alert as
  2    '/app/oracle/diag/rdbms/lsc01/lsc01/alert';

Directory created.

SQL> var c clob
SQL> declare
  2     b bfile := bfilename('ALERT','log.xml');
  3  begin
  4     dbms_lob.open(b,dbms_lob.file_readonly);
  5     dbms_lob.createtemporary(:c,true);
  6     dbms_lob.loadfromfile(:c,b,dbms_lob.lobmaxsize);
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> select extractvalue(xmlparse(content :c),
  2    '/msg[@time="2008-03-30T01:01:13.704+01:00"]/txt/text()')
  3  from dual;
EXTRACTVALUE(XMLPARSE(CONTENT:C),'/MSG[@TIME="2008-03-30T01:01:1
----------------------------------------------------------------

Starting ORACLE instance (normal)

Obfuscation contest

I have been challenged to participated (my script : lsc2.sql) and now challenged to explain my query by Chen
Chen Shapira

Ok, I give a try.

To make the query unreadable and unformatable I used no space, no new line, and I started by q’<'>‘ to confuses sqlinform.

Ok, I added undocumented constructs like

connect by .1 < 1.
connect by .25 > rownum/100.

with 2 connect by, the first is ignored. The second is buggy in 10gR1 and returns 25 rows in 10gR1 (but 24 rows in 9i, 10gR2, 11g).

and group by () which means only one group. Note this clause is mandatory, you cannot remove it when using an aggregate function and a scalar subquery like in
select (select 1 from dual), count(*) from dual group by ();

The utl_raw and to_char converts a number to the string Oracle Community. I will keep terse on the formula.
The scalar subquery inside TAN returns 1. The regexp removes some characters out of the extract tag. _x0032_ is the alias of the column “2”. The v$reservedwords use UNIQUE to sort the rows (UNIQUE was sorting in 10gR1 for UNIQUE/DISTINCT) and the MAX(rownum) retrieve the 316th row (FROM) and 845th row (SELECT). Remember ROWNUM is evaluated BEFORE aggregation, therefore ROWNUM 845 exists! Note the “+” is an alias and not a string!

With the SELECT and FROM keyword I build a new query that I dynamically evaluates with dbms_xmlgen.getxml. Rest of the first part is fancy calculation.

The dburi is also dynamic.

dburitype(
'/SYS/DBA_PROCEDURES/ROW[PROCEDURE_NAME=''GETKEY'']/OBJECT_NAME'
).getxml()

Well, I could have obfuscated the text a bit more but I was short of time… This simply return the name of the package that have a procedure called GETKEY. Which is DBMS_OBFUSCATION_TOOLKIT_FFI. This is not very clean as the query would fail if you have a PROC.GETKEY in your schema. Well, I wanted to add this not very well know mechanism to query the database.

Last part is in dict, I select the pattern ‘[COTTON+XE]{4,}’ in dict and return the Mode (or modal score), which is CONTEXT.

A bit formatting and I get

"NVL2"(Q'<'>',"UTL_RAW".CAST_TO_VARC
------------------------------------
Oracle Community_Obfuscation_Contest

isNumber in sql

I tried this in 11g
TABLE T

X
123
-1.2e-3
abc


select x, 
  to_number(
    xmlquery('number($X)' 
      passing x as x 
      returning content)) n 
from t;
X                N
------- ----------
123            123
-1.2e-3     -.0012
abc

it is quite a common task to extract numbers from varchar2 and to dig out poor quality data.

select x, to_number(x) from t;
ERROR:
ORA-01722: invalid number

A well-known PL/SQL approach would be to use exception. Ex:


create or replace function f(x varchar2) 
return number is 
begin return to_number(x); 
exception when others then return null; 
end;
/
select x, f(x) n from t;
X                N
------- ----------
123            123
-1.2e-3     -.0012
abc

another approach in plain sql could involve CASE and REGEXP

select x,
case when
regexp_like(x,
‘^-?(&#92+&#92.?|&#92d*&#92.&#92d+)([eE][+-]&#92d+)?$’)
then to_number(x)
end n
from t;
X N
——- ———-
123 123
-1.2e-3 -.0012
abc

get Nth column of a table

I answered this question twice, once on otn forums and once on developpez.net

Here is the latest to get the third column of emp

select
column_name as name,
extractvalue(column_value,’/ROW/’||column_name) as value
from table(xmlsequence(cursor(select * from emp))),
user_tab_columns
where COLUMN_ID=3 and table_name=’EMP’
;


NAME VALUE     
---- ----------
JOB  CLERK     
JOB  SALESMAN  
JOB  SALESMAN  
JOB  MANAGER   
JOB  SALESMAN  
JOB  MANAGER   
JOB  MANAGER   
JOB  ANALYST   
JOB  PRESIDENT 
JOB  SALESMAN  
JOB  CLERK     
JOB  CLERK     
JOB  ANALYST   
JOB  CLERK     

probably useless, but fun ;)

select from a comma-separated string

This is one question I solved today with XML on the developpez.net/forums

I have a table T

Un,Trois,Cinq,Six
Un,Deux,Quatre
Trois
Sept,Huit
Un,Six

I want to select un,deux,trois,quatre,cinq,six,sept,huit.

I could well have written a plsql function like


create or replace type c as object (value varchar2(40));
/
create or replace type t_c as table of c;
/
sho err
create or replace function f(
  list varchar2, 
  sep varchar2 default ',') 
return t_c pipelined is
  i number:=1;
begin
  loop
    if (instr(list,sep,1,i)>0)
    then
      if (i=1) then
        pipe row(c(substr(list, 
          1, instr(list,sep)-1)));
      else
        pipe row(c(substr(list, 
          instr(list,sep,1,i-1)+1,
          instr(list,sep,1,i)-
          instr(list,sep,1,i-1)-1)));
      end if;
    else
      if (i=1) then
        pipe row(c(list));
      else
        pipe row(c(substr(list, 
          instr(list,sep,1,i-1)+1)));
      end if;
      return;
    end if;
    i:=i+1;
  end loop;
end;
/
select distinct value
from t,
  table(f(c))
;

or anything using recursion or whatever.

I just tried with XML


SQL> select distinct extractvalue(column_value,’/x’)
2 from t,
3 table(xmlsequence(extract(xmltype(
4 ‘<list><x>’||replace(
5 c,’,’,'</x><x>’)||
6 ‘</x></list>’),
7 ‘/list/x’)));

EXTRACTVALUE(COLUMN_VALUE,’/X’)
——————————-
Trois
Un
Quatre
Cinq
Six
Deux
Sept
Huit

8 rows selected.

csv with XML revisited…

Special thanks to Tom for pointing and Michaels for fixing the missing manager of King in my previous post : csv format with select

Ok, here is a my PL/SQL table function.


create or replace type tv as table of
varchar2(4000);
/

create or replace function CSV(sqlQuery varchar2)
return tv pipelined is
ctx dbms_xmlgen.ctxhandle;
begin
ctx:=dbms_xmlgen.newcontext(sqlquery);
dbms_xmlgen.setnullhandling(ctx,dbms_xmlgen.empty_tag);
for f in (select
cast(
xmltransform(column_value,
XMLTYPE(‘<xsl:stylesheet version=”1.0″ xmlns:xsl’||
‘=”http://www.w3.org/1999/XSL/Transform”>’||
‘<xsl:output method=”text”/><xsl:variable name’||
‘=”new_line” select=””&#xA;”” /><xsl:template’||
‘ match=”ROWSET”><xsl:apply-templates select=”‘||
‘ROW”/></xsl:template><xsl:template match=”ROW”‘||
‘><xsl:for-each select=”*”><xsl:value-of select’||
‘=”.”/><xsl:if test=”position() != last()”><xsl’||
‘:value-of select=””,””/></xsl:if></xsl:for-‘||
‘each><xsl:value-of select=”$new_line” /></xsl:’||
‘template></xsl:stylesheet>’
)) as varchar2(4000))
c from table(xmlsequence(dbms_xmlgen.getxmltype(ctx)
.extract(‘ROWSET/ROW’))))
loop
pipe row(f.c);
end loop;
end;
/

select * from table(csv(‘select * from emp’));
COLUMN_VALUE
————————————————
7369,SMITH,CLERK,7902,17-DEC-80,800,,20
7499,ALLEN,SALESMAN,7698,20-FEB-81,1600,300,30
7521,WARD,SALESMAN,7698,22-FEB-81,1250,500,30
7566,JONES,MANAGER,7839,02-APR-81,2975,,20
7654,MARTIN,SALESMAN,7698,28-SEP-81,1250,1400,30
7698,BLAKE,MANAGER,7839,01-MAY-81,2850,,30
7782,CLARK,MANAGER,7839,09-JUN-81,2450,,10
7788,SCOTT,ANALYST,7566,19-APR-87,3000,,20
7839,KING,PRESIDENT,,17-NOV-81,5000,,10
7844,TURNER,SALESMAN,7698,08-SEP-81,1500,0,30
7876,ADAMS,CLERK,7788,23-MAY-87,1100,,20
7900,JAMES,CLERK,7698,03-DEC-81,950,,30
7902,FORD,ANALYST,7566,03-DEC-81,3000,,20
7934,MILLER,CLERK,7782,23-JAN-82,1300,,10

csv format with select *

One more trick with xml.

I want to get a semi-column separated format without having to specify the columns


alter session set nls_date_format=’YYYY-MM-DD';

Session altered.

select regexp_replace(column_value,’ *<[^>]*>[^>]*>’,';’)
from table(xmlsequence(cursor(select * from emp)));

;7369;SMITH;CLERK;7902;1980-12-17;800;20;
;7499;ALLEN;SALESMAN;7698;1981-02-20;1600;300;30;
;7521;WARD;SALESMAN;7698;1981-02-22;1250;500;30;
;7566;JONES;MANAGER;7839;1981-04-02;2975;20;
;7654;MARTIN;SALESMAN;7698;1981-09-28;1250;1400;30;
;7698;BLAKE;MANAGER;7839;1981-05-01;2850;30;
;7782;CLARK;MANAGER;7839;1981-06-09;2450;10;
;7788;SCOTT;ANALYST;7566;1987-04-19;3000;20;
;7839;KING;PRESIDENT;1981-11-17;5000;10;
;7844;TURNER;SALESMAN;7698;1981-09-08;1500;0;30;
;7876;ADAMS;CLERK;7788;1987-05-23;1100;20;
;7900;JAMES;CLERK;7698;1981-12-03;950;30;
;7902;FORD;ANALYST;7566;1981-12-03;3000;20;
;7934;MILLER;CLERK;7782;1982-01-23;1300;10;

How do i store the counts of all tables …

How do i store the counts of all tables …

My answer to the question above using dbms_xmlgen

SQL&gt; select
  2    table_name,
  3    to_number(
  4      extractvalue(
  5        xmltype(
  6 dbms_xmlgen.getxml('select count(*) c from '||table_name))
  7        ,'/ROWSET/ROW/C')) count
  8  from user_tables;

TABLE_NAME                      COUNT
------------------------------ ------
DEPT                                4
EMP                                14
BONUS                               0
SALGRADE                            5

search for a string in all tables of a schema

this is often asked on the forums. I also needed this a while ago while reverse engineering a database model.

Here is my today solution:

1) select * and extract the first column found per table with regexp (10g)

SQL> select table_name,column_name from (select rownum,table_name, regexp_substr(dbms_xmlgen.getxml(‘select * from “‘||table_name||'”‘),'<[^>]*>&string</[^<]*>’) column_name from user_tables) where length(column_name)!=0;
Enter value for string: 20
TABLE_NAME COLUMN_NAME
———- ——————————
DEPT       <DEPTNO>20</DEPTNO>
EMP        <DEPTNO>20</DEPTNO>

SQL> select table_name,column_name from (select rownum,table_name, regexp_substr(dbms_xmlgen.getxml(‘select * from “‘||table_name||'”‘),'<[^>]*>&string</[^<]*>’) column_name from user_tables) where length(column_name)!=0;
Enter value for string: KING
TABLE_NAME COLUMN_NAME
———- ——————————
EMP        <ENAME>KING</ENAME>
BONUS      <ENAME>KING</ENAME>

2) add a where condition. much slower of course, because scanning the table more often. somehow nicer output. More sensible to datatypes. Here for number.

SQL> select table_name, column_name from (select rownum,table_name, column_name, dbms_xmlgen.getxml(‘select 1 from “‘||table_name||'” where “‘||column_name||'”=&number’) x from user_tab_columns where data_type=’NUMBER’) where length(x)!=0;
Enter value for number: 3000
TABLE_NAME COLUMN_NAME
———- ——————————
EMP        SAL
BONUS      COMM
SALGRADE   HISAL