Tag Archives: star schema

On star transformation

How to configure a schema to support a star transformation query?

This is one of the topic I am currently preparing for the OCM DBA 11g upgrade exam for 9i OCM’s.

Let’s create a star schema with a fact table T1 and two dimension tables T2 and T3 :


> create table t2(y number constraint t2_pk primary key, yy varchar2(50))
table t2 created.
> create table t3(z number constraint t3_pk primary key, zz varchar2(50))
table t3 created.
> create table t1(x number constraint t1_pk primary key, 
      y number constraint t1_t2_fk references t2, 
      z number constraint t1_t3_fk references t3)
table t1 created.

According to the Oracle Database Data Warehousing Guide, chapter 20 Schema Modeling Techniques
A prerequisite of the star transformation is that there be a single-column bitmap index on every join column of the fact table. These join columns include all foreign key columns.


> create bitmap index bi1 on t1(y)
bitmap index bi1 created.
> create bitmap index bi2 on t1(z)
bitmap index bi2 created.
> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t3',cascade=>true)
anonymous block completed

Obviously the star_transformation_enabled must be set to true. It was known as buggy in the first releases, I hope most bugs have been fixed in the meantime.


> alter session set star_transformation_enabled=true
session set altered

Let’s try the transformation with a query similar to the one found in the doc :

> set autotrace on exp
Autotrace Enabled
Displays the execution plan only.
> select count(*) from t1 natural join t2 natural join t3 
      where yy='one' and zz in ('two','three')
COUNT(*)               
---------------------- 
0                      

Plan hash value: 3024982001
 
------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                  |     1 |   106 |     1 (100)| 00:00:01 |
|   1 |  SORT AGGREGATE                   |                  |     1 |   106 |            |          |
|   2 |   NESTED LOOPS                    |                  |       |       |            |          |
|   3 |    NESTED LOOPS                   |                  |     1 |   106 |     1 (100)| 00:00:01 |
|   4 |     NESTED LOOPS                  |                  |     1 |    66 |     1 (100)| 00:00:01 |
|   5 |      VIEW                         | index$_join$_001 |     1 |    26 |     1 (100)| 00:00:01 |
|*  6 |       HASH JOIN                   |                  |       |       |            |          |
|   7 |        BITMAP CONVERSION TO ROWIDS|                  |     1 |    26 |     0   (0)| 00:00:01 |
|   8 |         BITMAP INDEX FULL SCAN    | BI1              |       |       |            |          |
|   9 |        BITMAP CONVERSION TO ROWIDS|                  |     1 |    26 |     0   (0)| 00:00:01 |
|  10 |         BITMAP INDEX FULL SCAN    | BI2              |       |       |            |          |
|* 11 |      TABLE ACCESS BY INDEX ROWID  | T2               |     1 |    40 |     0   (0)| 00:00:01 |
|* 12 |       INDEX UNIQUE SCAN           | T2_PK            |     1 |       |     0   (0)| 00:00:01 |
|* 13 |     INDEX UNIQUE SCAN             | T3_PK            |     1 |       |     0   (0)| 00:00:01 |
|* 14 |    TABLE ACCESS BY INDEX ROWID    | T3               |     1 |    40 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - access(ROWID=ROWID)
  11 - filter("T2"."YY"='one')
  12 - access("T1"."Y"="T2"."Y")
  13 - access("T1"."Z"="T3"."Z")
  14 - filter("T3"."ZZ"='three' OR "T3"."ZZ"='two')

> set autotrace off
Autotrace Disabled

Something is missing. The star transformation is not used. What could it be???

Data !

We need to load trucks of data in the T1 table and a few rows in the T2 and T3 tables

  
> insert into t2 select rownum, 
      to_char(to_timestamp(to_char(rownum/1e9,'.000000000'),'.ff'),'ffsp') 
      from dual connect by level<11
10 rows inserted.
> insert into t3 select rownum, 
      to_char(to_timestamp(to_char(rownum/1e9,'.000000000'),'.ff'),'ffsp') 
      from dual connect by level<11
10 rows inserted.
> insert into t1 select rownum, mod(rownum,7)+1, mod(rownum,8)+1 
      from dual connect by level<100001
100,000 rows inserted.
> commit
commited.
> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t2',cascade=>true)
anonymous block completed
> exec dbms_stats.gather_table_stats(user,'t3',cascade=>true)
anonymous block completed

Let’s try the same query again


> set autotrace on exp
Autotrace Enabled
Displays the execution plan only.
> select count(*) from t1 natural join t2 natural join t3
      where yy='one' and zz in ('two','three')
COUNT(*)               
---------------------- 
3570                   

Plan hash value: 3170767457
 
-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |     6 |    12   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |     6 |            |          |
|   2 |   BITMAP CONVERSION COUNT    |      |  2857 | 17142 |     6   (0)| 00:00:01 |
|   3 |    BITMAP AND                |      |       |       |            |          |
|   4 |     BITMAP MERGE             |      |       |       |            |          |
|   5 |      BITMAP KEY ITERATION    |      |       |       |            |          |
|*  6 |       TABLE ACCESS FULL      | T2   |     1 |     8 |     3   (0)| 00:00:01 |
|*  7 |       BITMAP INDEX RANGE SCAN| BI1  |       |       |            |          |
|   8 |     BITMAP MERGE             |      |       |       |            |          |
|   9 |      BITMAP KEY ITERATION    |      |       |       |            |          |
|* 10 |       TABLE ACCESS FULL      | T3   |     2 |    16 |     3   (0)| 00:00:01 |
|* 11 |       BITMAP INDEX RANGE SCAN| BI2  |       |       |            |          |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - filter("T2"."YY"='one')
   7 - access("T1"."Y"="T2"."Y")
  10 - filter("T3"."ZZ"='three' OR "T3"."ZZ"='two')
  11 - access("T1"."Z"="T3"."Z")
 
Note
-----
   - star transformation used for this statement

> set autotrace off
Autotrace Disabled

That’s it ! I have 100K rows in my fact table and 10 rows in my dimension tables, that does the trick ! Star transformation needs a lot of rows to work :)