Categories
sql

On star transformation

How to configure a schema to support a star transformation query?

This is one of the topic I am currently preparing for the OCM DBA 11g upgrade exam for 9i OCM’s.

Let’s create a star schema with a fact table T1 and two dimension tables T2 and T3 :

> create table t2(y number constraint t2_pk primary key, yy varchar2(50))
table t2 created.
> create table t3(z number constraint t3_pk primary key, zz varchar2(50))
table t3 created.
> create table t1(x number constraint t1_pk primary key,
y number constraint t1_t2_fk references t2,
z number constraint t1_t3_fk references t3)
table t1 created.

According to the Oracle Database Data Warehousing Guide, chapter 20 Schema Modeling Techniques
A prerequisite of the star transformation is that there be a single-column bitmap index on every join column of the fact table. These join columns include all foreign key columns.


> create bitmap index bi1 on t1(y)
bitmap index bi1 created.
> create bitmap index bi2 on t1(z)
bitmap index bi2 created.
> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t3',cascade=>true)
anonymous block completed

Obviously the star_transformation_enabled must be set to true. It was known as buggy in the first releases, I hope most bugs have been fixed in the meantime.

> alter session set star_transformation_enabled=true
session set altered

Let’s try the transformation with a query similar to the one found in the doc :

> set autotrace on exp
Autotrace Enabled
Displays the execution plan only.
> select count(*) from t1 natural join t2 natural join t3
where yy='one' and zz in ('two','three')
COUNT(*)
----------------------
0

Plan hash value: 3024982001

------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 106 | 1 (100)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 106 | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 1 | 106 | 1 (100)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 66 | 1 (100)| 00:00:01 |
| 5 | VIEW | index$_join$_001 | 1 | 26 | 1 (100)| 00:00:01 |
|* 6 | HASH JOIN | | | | | |
| 7 | BITMAP CONVERSION TO ROWIDS| | 1 | 26 | 0 (0)| 00:00:01 |
| 8 | BITMAP INDEX FULL SCAN | BI1 | | | | |
| 9 | BITMAP CONVERSION TO ROWIDS| | 1 | 26 | 0 (0)| 00:00:01 |
| 10 | BITMAP INDEX FULL SCAN | BI2 | | | | |
|* 11 | TABLE ACCESS BY INDEX ROWID | T2 | 1 | 40 | 0 (0)| 00:00:01 |
|* 12 | INDEX UNIQUE SCAN | T2_PK | 1 | | 0 (0)| 00:00:01 |
|* 13 | INDEX UNIQUE SCAN | T3_PK | 1 | | 0 (0)| 00:00:01 |
|* 14 | TABLE ACCESS BY INDEX ROWID | T3 | 1 | 40 | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - access(ROWID=ROWID)
11 - filter("T2"."YY"='one')
12 - access("T1"."Y"="T2"."Y")
13 - access("T1"."Z"="T3"."Z")
14 - filter("T3"."ZZ"='three' OR "T3"."ZZ"='two')

> set autotrace off
Autotrace Disabled

Something is missing. The star transformation is not used. What could it be???

Data !

We need to load trucks of data in the T1 table and a few rows in the T2 and T3 tables

> insert into t2 select rownum,
to_char(to_timestamp(to_char(rownum/1e9,'.000000000'),'.ff'),'ffsp')
from dual connect by level<11 10 rows inserted. > insert into t3 select rownum,
to_char(to_timestamp(to_char(rownum/1e9,'.000000000'),'.ff'),'ffsp')
from dual connect by level<11 10 rows inserted. > insert into t1 select rownum, mod(rownum,7)+1, mod(rownum,8)+1
from dual connect by level<100001 100,000 rows inserted. > commit
commited.
> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t3',cascade=>true)
anonymous block completed

Let’s try the same query again

> set autotrace on exp
Autotrace Enabled
Displays the execution plan only.
> select count(*) from t1 natural join t2 natural join t3
where yy='one' and zz in ('two','three')
COUNT(*)
----------------------
3570

Plan hash value: 3170767457

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 12 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 6 | | |
| 2 | BITMAP CONVERSION COUNT | | 2857 | 17142 | 6 (0)| 00:00:01 |
| 3 | BITMAP AND | | | | | |
| 4 | BITMAP MERGE | | | | | |
| 5 | BITMAP KEY ITERATION | | | | | |
|* 6 | TABLE ACCESS FULL | T2 | 1 | 8 | 3 (0)| 00:00:01 |
|* 7 | BITMAP INDEX RANGE SCAN| BI1 | | | | |
| 8 | BITMAP MERGE | | | | | |
| 9 | BITMAP KEY ITERATION | | | | | |
|* 10 | TABLE ACCESS FULL | T3 | 2 | 16 | 3 (0)| 00:00:01 |
|* 11 | BITMAP INDEX RANGE SCAN| BI2 | | | | |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - filter("T2"."YY"='one')
7 - access("T1"."Y"="T2"."Y")
10 - filter("T3"."ZZ"='three' OR "T3"."ZZ"='two')
11 - access("T1"."Z"="T3"."Z")

Note
-----
- star transformation used for this statement

> set autotrace off
Autotrace Disabled

That’s it ! I have 100K rows in my fact table and 10 rows in my dimension tables, that does the trick ! Star transformation needs a lot of rows to work 🙂