Irrecoverable part III : a fix

After part I:the problem, Part II: two reports here is part III.

We have backed up archivelog during a full and deleted them before the full completed.


RMAN> list backup of archivelog all;

List of Backup Sets
===================

BS Size   Type Elapsed  Completion
-- ------ ---- -------- ----------
15 4.00K  DISK 00:00:00   13:31:08
    BP Key: 15   Status: AVAILABLE
    Piece Name: /FULL/0fq7gc0s_1_1

  List of Archived Logs in backup set 15
  Seq LowSCN  LowTime  NextSCN NextTime
  --- ------- -------- ------- --------
  15  355533  13:29:55 355777  13:31:08

BS Size   Type Elapsed  Completion
-- ------ ---- -------- ----------
25 4.00K  DISK 00:00:00   13:31:26
    BP Key: 25   Status: AVAILABLE
    Piece Name: /ARCH/0pq7gc1e_1_1

  List of Archived Logs in backup set 25
  Seq LowSCN  LowTime  NextSCN NextTime
  --- ------- -------- ------- --------
  15  355533  13:29:55 355777  13:31:08

BS Size   Type Elapsed  Completion
-- ------ ---- -------- ----------
26 3.00K  DISK 00:00:00 13:31:26
  BP Key: 26   Status: AVAILABLE
  Piece Name: /ARCH/0qq7gc1e_1_1

  List of Archived Logs in backup set 26
  Seq LowSCN  LowTime  NextSCN NextTime
  --- ------- -------- ------- --------
  16  355777  13:31:08 355827  13:31:22

BS Size   Type Elapsed  Completion
-- ------ ---- -------- ----------
28 2.50K  DISK 00:00:00 13:31:28
    BP Key: 28   Status: AVAILABLE
    Piece Name: /FULL/0sq7gc1g_1_1

  List of Archived Logs in backup set 28
  Seq LowSCN  LowTime  NextSCN NextTime
  --- ------- -------- ------- --------
  17  355827  13:31:22 355863  13:31:28

Sequence 16 is missing in /FULL/ and is available in /ARCH/

Ok, let’s copy the missing pieces from ARCH to FULL


$ cd /
$ cp -p ARCH/* FULL/

Now let’s try. I save my directory on an offline location (tape/usb/cdrom) and dropped my database including backups.


cd /
tar cvf /tmp/FULL.tar FULL

Let’s drop (it’s just a test)


RMAN> startup force mount dba 
RMAN> drop database including backups;
database dropped

Ok let’s get this USB stick back from the moon to my datacenter and see if I can restore …


cd /
tar xvf /tmp/FULL.tar 

Let’s do this !


RMAN> startup force nomount
RMAN> restore controlfile from '/FULL/c-2414975447-20150521-01';
RMAN> alter database mount;

Now I want RMAN to look for other files in FULL and also marked expired stuff. The start with does the magic.

RMAN> catalog start with '/FULL/' noprompt;
RMAN> crosscheck backup;
RMAN> crosscheck copy;
RMAN> crosscheck archivelog all;
RMAN> restore database;
RMAN> recover database;
RMAN> alter database open resetlogs;

As @dbastorage said on twitter:
“if it was easy, we would not need DBAs! ”

Irrecoverable full backup part II : reporting

After my post Can you restore from a full online backup ?, I needed to come up with a report.

Assuming that each backup goes in a different directory, I just wrote two reports.

  1. Report gaps in v$backup_redolog (or rc_backup_redolog if you use the catalog)
    
    DIR     FIRST_CHANGE# NEXT_CHANGE#
    ------- ------------- ------------
    /bck01/        284891       285140
    /bck01/        285140       285178
    
    /bck02/        284891       285140
    === GAP ===
    /bck02/        285178       285245 
    
    /bck03/        285178       285245
    /bck03/        285245       286931
    /bck03/        286931       287803
    /bck03/        287803       288148
    

    This could be done with analytics, by checking where the last next_change is not the current first_change, within a directory

    
    SELECT dir, 
      LAG missing_from_change#, 
      first_change# missing_to_change#
    FROM (
      SELECT REGEXP_REPLACE (handle, '[^/\]+$') dir,
        first_change#,
        next_change#,
        LAG(next_change#) OVER (
          PARTITION BY REGEXP_REPLACE (handle, '[^/\]+$')
          ORDER BY first_change#
        ) LAG
      FROM v$backup_piece p
      JOIN v$backup_redolog l 
        USING (set_stamp, set_count))
    WHERE LAG != first_change#;
    
    DIR     MISSING_FROM_CHANGE# MISSING_TO_CHANGE#
    ------- -------------------- ------------------
    /bck02/               285140             285178
    
  2. Reports directories where archivelogs don’t include changes (backup redolog) from the earliest to the latest checkpoint (backup datafile)
    
    SELECT 
      REGEXP_REPLACE (handle, '[^/\]+$') dir,
      MIN (checkpoint_change#),
      MAX (checkpoint_change#),
      MIN (first_change#),
      MAX (next_change#)
    FROM v$backup_piece p
      LEFT JOIN v$backup_datafile f 
        USING (set_stamp, set_count)
      LEFT JOIN v$backup_redolog l 
        USING (set_stamp, set_count)
    WHERE handle IS NOT NULL
    HAVING 
      MIN (checkpoint_change#) < MIN (first_change#)
      OR 
      MAX (checkpoint_change#) > MAX (next_change#)
    GROUP BY REGEXP_REPLACE (handle, '[^/\]+$');
    
    DIR     MINCHECKPT MAXCHECKPT   MINFIRST    MAXNEXT
    ------- ---------- ---------- ---------- ----------
    /bck04/     954292     954299     959487    1145473
    

    the archives for the changes from 954292 to 959487 are missing.

If some archive backups are missing in one directory, it does not mean the database is irrecoverable, the archive backups could be in another directory. But it means that single directory would no longer permit you to restore or duplicate.

Another approach with RESTORE PREVIEW was provided by Franck in my previous post : List all RMAN backups that are needed to recover.

Usual disclaimer: there are plenty of other irrecoverabilty causes from hardware defect to backup “optimization” that are beyond the scope of this post.

Can you restore from a full online backup ?

The question is not HOW TO DO IT but WHETHER YOU CAN DO IT !

A typical backup script would contains something like BACKUP DATABASE PLUS ARCHIVELOG:


backup database format
'/u99/backup/DB01/20150518/full_0_%d_s%s_p%p'
plus archivelog format
'/u99/backup/DB01/20150518/arc_%d_s%s_p%p';

Starting backup at 2015-05-18_18:27:55
current log archived
input archived log thread=1 sequence=469
...
piece handle=
/u99/backup/DB01/20150518/arc_DB01_s86_p1
Finished backup at 2015-05-18_18:27:58

Starting backup at 2015-05-18_18:27:58
input datafile file number=00002 name=
/u02/oradata/DB01/undots01.dbf
...
including current control file in backup set
including current SPFILE in backup set
piece handle=
/u99/backup/DB01/20150518/full_0_DB01_s88_p1
Finished backup at 2015-05-18_18:28:16

Starting backup at 2015-05-18_18:28:16
current log archived
input archived log thread=1 sequence=20
piece handle=
/u99/backup/DB01/20150518/arc_DB01_s89_p1
Finished backup at 2015-05-18_18:28:17

This sounds pretty safe, but what happened if you keep this full backup for ever ? do you have all what you need in it to restore ?

It depends. Chance exists that you can restore. To annoy the paranoiacs, here is a counter example.

  1. you start your full backup at 6pm.
    backuping sequence 21,22,23,24
    backup datafiles…

  2. at 7 pm one cronjob issue backup archivelog all delete input;
    backuping and deleting sequence 21,22,23,24,25,26,27,28,29,30

  3. at 8pm your backup is about to finish
    backuping sequence 31

Well, where is my sequence 27 ?

Let’s try


rm /tmp/arch_DB01*
startup force mount;
crosscheck backup;restore database;


Starting restore at 2015-05-18_18:47:45
channel ORA_DISK_1: restore complete, elapsed time: 00:02:05
Finished restore at 2015-05-18_18:49:51

Ok, the restore was fine. Now what?


RMAN&gt; recover database;

Starting recover at 2015-05-18_18:50:35
using channel ORA_DISK_1

starting media recovery

RMAN-00571: =================================================
RMAN-00569: ========== ERROR MESSAGE STACK FOLLOWS ==========
RMAN-00571: =================================================
RMAN-03002: failure of recover command at 05/18/2015 18:50:36
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of archived log for thread 1 with sequence 30
RMAN-06025: no backup of archived log for thread 1 with sequence 29
RMAN-06025: no backup of archived log for thread 1 with sequence 27

RMAN&gt; alter database open;

RMAN-00571: =================================================
RMAN-00569: ========== ERROR MESSAGE STACK FOLLOWS ==========
RMAN-00571: =================================================
RMAN-03002: failure of alter db command at 05/18/2015 18:51:29
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/u02/oradata/DB01/system01.dbf'

You have lost your database. Even if you did a full online backup plus archivelog, it did not contain all the archivelogs and RMAN did not tell you.

There is more than one way to reduce this annoyance :

1) don’t backup archivelogs during full backup
2) don’t delete archivelog backups done during a full backup
3) report any archivelog backups that run during a full. Either by looking in the RMAN dictionary or in your house logfiles
4) do offline backup for long term archives
5) don’t trust RMAN

And as well :
Test your backups 😀

bypass ora-20

When you really need to run one script, at all cost, an annoying error is ORA-00020: maximum number of processes (40) exceeded, which can even occurs as sysdba.

Test case (21 is a not something to do in real life):


SQL> alter system set processes=21 scope=spfile;
System altered.
SQL> startup force quiet
ORACLE instance started.
Database mounted.
Database opened.
SQL> quit

From now on, sqlplus as sysdba is impossible.

$ sqlplus -s -L / as sysdba
ERROR:
ORA-00020: maximum number of processes (40) exceeded
SP2-0751: Unable to connect to Oracle.  Exiting SQL*Plus

Okay, if you really really need to run one script, you could connect with sqlplus -prelim and restart the database.


if :|sqlplus / as sysdba|grep ORA-00020
then
  echo shu abort|sqlplus -prelim / as sysdba
  echo startup quiet|sqlplus / as sysdba
fi

If ORA-20 is detected, then the database will be restarted.

switch user in Oracle

Almost a decade ago I wrote about su in sqlplus. This 10gR2 “new” feature allows delegation à la sudo.

By checking the DBA_USERS in 12c I found PROXY_ONLY_CONNECT. According to Miguel Anjo, there is a secret syntax for allowing only the proxy user.


SQL> ALTER USER app_user PROXY ONLY CONNECT;
SQL> CONNECT app_user/xyz
ERROR:ORA-28058: login is allowed only through a proxy

Unusable index

After table maintenance, like move or split partition, underlying indexes are marked as unusable.

This boils down to segment reorganisation, not dictionary change.

For instance :


CREATE TABLE t(x NUMBER)
PARTITION BY RANGE(x)
(PARTITION p1 VALUES LESS THAN (MAXVALUE));

CREATE INDEX i ON t (x);


INSERT INTO t VALUES (1);

ALTER TABLE T SPLIT PARTITION p1 AT 
  (100) INTO (PARTITION p2, PARTITION p3);

SELECT index_name, status 
FROM user_indexes 
WHERE index_name = 'I';

INDEX_NAME STATUS  
---------- --------
I          VALID

The partition p1 is splitted into two p2 and p3, all rows from p1 belongs to p2, p3 is created empty, no data manipulation, the index remains valid


delete from t; 
INSERT INTO t VALUES (250);

ALTER TABLE T SPLIT PARTITION p3 AT 
  (200) INTO (PARTITION p4, PARTITION p5);

SELECT index_name, status 
FROM user_indexes 
WHERE index_name = 'I';

INDEX_NAME STATUS  
---------- --------
I          VALID

Same here, all rows from p3 moved to p5, p4 is created empty, no problem


delete from t; 
INSERT INTO t VALUES (250);
INSERT INTO t VALUES (350);

ALTER TABLE T SPLIT PARTITION p5 AT 
  (300) INTO (PARTITION p6, PARTITION p7);

SELECT index_name, status 
FROM user_indexes 
WHERE index_name = 'I';
INDEX_NAME STATUS  
---------- --------
I          UNUSABLE

One row goes to p6, one row to p7, the index is invalidated.

To avoid this, use ALTER TABLE T SPLIT PARTITION p5 AT (300) INTO (PARTITION p6, PARTITION p7) UPDATE INDEXES;

What are the consequences?


INSERT INTO t VALUES (320);
1 row inserted

It does not prevent DML in this case.


select * from t where x=320;

         X
----------
       320

Execution Plan
---------------------------------------------------
 0   SELECT STATEMENT Optimizer Mode=ALL_ROWS
 1 0   PARTITION RANGE SINGLE
 2 1     TABLE ACCESS FULL SCOTT.T

The row is retrieved, but the unusable index is not used.

In some case, DML may be prevented.


alter index i rebuild;
alter table t modify (x primary key using index);
ALTER TABLE T SPLIT PARTITION p7 AT 
  (330) INTO (PARTITION p8, PARTITION p9) ;

insert into t values (450);
ORA-01502: index 'SCOTT.I' or partition of such index is in unusable state

The index can no longer be used for constraint enforcement and the INSERT fails.

If the index is partitioned, the index is not marked as unusable as a whole, only the affected (sub)partitions are marked.

Check all your indexes status in dba_indexes.status, dba_ind_partitions.status and dba_ind_subpartitions.status.

Rebuild them with alter index i rebuild, alter index i rebuild partition p and alter index i rebuild subpartition sp.

Did you forget to run root.sh?

Not easy to detect, and depending on the product (agent/database), it may have only limited side effects.

Like external jobs not running, operating systems statistics not collected.

But it is not always easy to diagnose.

For instance if you patch from OMS 12cR2 to 12cR3, and you run the root.sh only in 12cR2, they are very few statistics missing (one is the OS_STORAGE_ENTITY).

Running the root.sh doesn’t generate a log file or an entry in the inventory.

To check if it was executed, check what it is supposed to do. It is a bit different in each version. One think it always does is changing the ownership to root and set the sticky bit for a few binaries. For the database, this is done in sub-scripts called rootadd.sh (10g) or rootadd_rdbms.sh (11g/12c).


eval ls -l $(find $ORACLE_HOME -name "rootadd*sh" -exec awk '$1="$CHOWN"&&$2=="root"{print $3}' {} \;|sort -u)

-rwsr-x--- root dba .../product/11.2.0/db_4/bin/extjob
-rwsr-x--- root dba .../product/11.2.0/db_4/bin/jssu
-rws--x--- root dba .../product/11.2.0/db_4/bin/nmb
-rws--x--- root dba .../product/11.2.0/db_4/bin/nmhs
-rws--x--- root dba .../product/11.2.0/db_4/bin/nmo
-rwsr-x--- root dba .../product/11.2.0/db_4/bin/oradism
-rw-r----- root dba ...11.2.0/db_4/rdbms/admin/externaljob.ora

If the ownership is root, you definitely did run the root.sh.

On the 12c agent, there is a FULL_BINARY_LIST variable that point to list of root binaries in sbin


eval $(grep FULL_BINARY_LIST= $AGENT_HOME/root.sh)
cd $AGENT_HOME/../../sbin
ls -l $FULL_BINARY_LIST

-rws--x--- root dba nmb
-rws--x--- root dba nmhs
-rws--x--- root dba nmo

If all files exist and belong root, it looks like you did run the root.sh.

last partition

if you really need to quickly find the latest partition per table, I have written this little gem


WITH FUNCTION d (b BLOB, len number) RETURN DATE IS
  d DATE;
BEGIN
  IF DBMS_LOB.SUBSTR (b, 1, 1) = hextoraw('07') and len=83
  THEN
    DBMS_STATS.convert_raw_value (DBMS_LOB.SUBSTR (b, 12, 2), d);
  ELSE
    d := NULL;
  END IF;
  RETURN d;
END;
SELECT 
  u.name owner,
  o.name table_name,
  max(d (bhiboundval, hiboundlen)) last_partition
FROM sys.tabpart$ tp
  JOIN sys.obj$ o USING (obj#)
  JOIN sys.user$ u ON u.user# = o.owner#
WHERE u.name='SCOTT'
group by u.name, o.name
order by last_partition desc;

It doesn’t cover all partitioning type, but it is pretty fast and simple

Do you have a partition in 2015

You need to check the high_value from dba_tab_partitions.
Or you you could metadata.

With metadata, it is not a long, it either clob or clob-xml.



  SELECT t.table_name,
         MAX (
            TO_DATE (
               REGEXP_SUBSTR (
                  EXTRACT ( (VALUE (x)), 'HIBOUNDVAL').getStringVal (),
                  ' \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'),
               'YYYY-MM-DD HH24:MI;:SS'))
            high_value
    FROM user_tables t,
         TABLE (
            XMLSEQUENCE (
               EXTRACT (
                  xmltype (DBMS_METADATA.get_xml ('TABLE', t.table_name)),
                  '//ROWSET/ROW/TABLE_T/PART_OBJ/PART_LIST/PART_LIST_ITEM/HIBOUNDVAL'))) x
   WHERE partitioned = 'YES'
  HAVING MAX (
            TO_DATE (
               REGEXP_SUBSTR (
                  EXTRACT ( (VALUE (x)), 'HIBOUNDVAL').getStringVal (),
                  ' \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'),
               'YYYY-MM-DD HH24:MI;:SS')) < DATE '2015-01-02'
GROUP BY t.table_name
ORDER BY 1;

TABLE_NAME HIGH_VALUE                    
---------- ------------------------------
T1         2015-01-01                    
T11        2000-01-01                    
T20        1436-03-09                    
T6         2000-01-01                    
T7         2014-12-21                    
T8         2015-01-01

Ok, out of xml, I get all HIBOUNDVAL and hazardously try some regular expression to find a pattern.

It already returned most of my table that could have caused me New Eve headache. But partitioned is not that simple. Line 3 for instance is not in a gregorian format (it’s a bug). And it does not cover index partition, subpartitions, interval partitions, neither partition with more than one date column as key.

This sounds a lot, and there is only one HIGH_VALUE for multiple key, and the High_value is something like 0,TO_DATE(' 3543-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'),MAXVALUE

Okay, if you forget about non-gregorian calendars, and ignore about MAXVALUE, we could use something like


curid := DBMS_SQL.open_cursor;
txt :=
  'select '
  || REPLACE (g.high_value, 'MAXVALUE', 'NULL')
  || ' from dual';
DBMS_SQL.parse (curid, txt, DBMS_SQL.NATIVE);

Then from dba_PART_KEY_COLUMNS (pkc) or dba_SUBPART_KEY_COLUMNS, joined with dba_tab_columns, we could retrieve the datatype (date and timestamp with or without time zone) and the pkc.column_position.


DBMS_SQL.DEFINE_COLUMN (curid,
  f.column_position,
  'date',
  12);

Casting to date should be fine.

Then we fetch and retrieve the Nth column


IF DBMS_SQL.FETCH_ROWS (curid) > 0
THEN
  DBMS_SQL.COLUMN_VALUE (curid, f.column_position, namevar);
END IF;

If you read me that far, you probably can write the rest of the code for yourself.

Could be useful to do this before end-of-year 😉

poor man ActiveDirectory password checker

To have the same users in multiple databases and no single sign on is quite a nightmare for password expiration, synchronisation and validation.

You probably were discouraged by the long long route to kerberos, where the 11.2.0.2 bugs are fixed in 11.2.0.4, the 12.1 bugs are fixed in 12.2. And lot’s of system changes that won’t be welcome by your sysadmins / winadmins.

Okay, to partly cover the password expiration issue, you could check in a profile function that the password is the one from AD.

Firstly, without SSL


CREATE OR REPLACE FUNCTION pw_function_AD
(username varchar2,
 password varchar2,
 old_password varchar2)
RETURN boolean IS
  sess raw(32);
  rc number;
BEGIN
  sess := DBMS_LDAP.init(
    'example.com',dbms_ldap.PORT);
  rc := DBMS_LDAP.simple_bind_s(
    sess, username||'@example.com', 
    password);
  rc := DBMS_LDAP.unbind_s(sess);
  RETURN(TRUE);
EXCEPTION
  WHEN OTHERS THEN
    rc := DBMS_LDAP.unbind_s(sess);
    raise;
END;
/
GRANT EXECUTE ON pw_function_ad TO PUBLIC;
CREATE PROFILE AD LIMIT 
  PASSWORD_VERIFY_FUNCTION pw_function_AD;
ALTER PROFILE AD LIMIT 
  PASSWORD_LIFE_TIME 30;
ALTER PROFILE AD LIMIT 
  PASSWORD_REUSE_MAX UNLIMITED;

alter user lsc profile AD;

When the password expires, the user must change it to its AD Password.

If I try with a dummy password, the profile will reject this


SQL> conn lsc/pw1
ERROR:
ORA-28001: the password has expired

Changing password for lsc
New password:anypassword
Retype new password:anypassword
ERROR:
ORA-28003: password verification for 
  the specified password failed
ORA-31202: DBMS_LDAP: LDAP client/server 
  error: Invalid credentials. 
  80090308: LdapErr: DSID-0C0903A9, 
  comment: AcceptSecurityContext error, 
    data 52e, v1db1
Password unchanged
Warning: You are no longer connected to ORACLE.

I need to enter my Windows password


SQL> conn lsc/pw1
ERROR:
ORA-28001: the password has expired

Changing password for lsc
New password: mywindowspassword
Retype new password: mywindowspassword
Password changed
Connected.

Secondly, with SSL.

Maybe simple bind without SSL is not possible (check http://support.microsoft.com/kb/935834). And for sure it is better to not send unencrypted plain text password over the network.

Create a wallet with password with the ROOT Certification Authority that signed your AD. You probably could download this in your trusted root certification authorities in Internet Explorer.

Internet Explorer – Tools – Internet Options – Content – Certificates – Trusted root.

Then you create a ewallet.p12 with orapki. No need for user certificate and no need for single-sign-on. Only import the trusted root (and intermediaries if applicable).

Here is the modified code


CREATE OR REPLACE FUNCTION pw_function_AD
(username varchar2,
 password varchar2,
 old_password varchar2)
RETURN boolean IS
  sess raw(32);
  rc number;
BEGIN
  sess := DBMS_LDAP.init(
    'example.com',dbms_ldap.SSL_PORT);
  rc := DBMS_LDAP.open_ssl(
    sess, 'file:/etc/wallet/MSAD', 
    'welcome1', 2);
  rc := DBMS_LDAP.simple_bind_s(
    sess, username||'@example.com', 
    password);
  rc := DBMS_LDAP.unbind_s(sess);
  RETURN(TRUE);
EXCEPTION
  WHEN OTHERS THEN
    rc := DBMS_LDAP.unbind_s(sess);
    raise;
END;
/

If you get SSL Handshake, be prepared, it could be anything! Check your wallet, your certificate, your permission, your wallet password.

One step further could be to expire users as soon as they change their password in AD or when they expire there.

For instance with powershell goodies for active directory


PS> (Get-ADuser lsc -properties PasswordLastSet).PasswordLastSet

Montag, 6. Oktober 2014 08:18:23

PS> (Get-ADuser king -properties AccountExpirationDate).AccountExpirationDate

Mittwoch, 16. Juli 2014 06:00:00

And in the database


SQL> SELECT ptime FROM sys.user$ 
  WHERE name ='LSC';

PTIME
-------------------
2014-11-10_10:33:08

If PTIME is less than PasswordLastSet or if AccountExpirationDate is not null, expire the account.

In conclusion : if you do not want to use Kerberos, nor Oracle “OctetString” Virtual Directory ovid nor Oracle Internet directory oid, this workaround may help to increase your security by addressing the “shared” and “expired” accounts problematic

There an additional hidden benefit. You could set up a self-service password reset function and send a generated expired password per mail, that the user won’t be able to change without its AD password

import into UTF8 database

A common error when you import single-byte characters (e.g. iso8859p1 or mswin1252) into multi-bytes databases (e.g. utf8) is ORA-12899: value too large for column.

The root cause is the default semantics in a database being BYTE


SQL> select VALUE, ISDEFAULT 
  from v$parameter 
  where NAME='nls_length_semantics'
VALUE   ISDEFAULT
------- ---------
BYTE    TRUE

It means, one char equals one byte. But after conversion, one char is larger than one byte and does not fit any longer.

single-byte


SQL> select VALUE 
  from nls_database_parameters 
  where parameter='NLS_CHARACTERSET';
VALUE
-------------
WE8MSWIN1252
SQL> create table t(x char(1));
Table created.
SQL> insert into t values ('é');
1 row created.
SQL> commit;
Commit complete.
$ expdp scott/tiger dumpfile=t.dmp tables=t
. . exported "SCOTT"."T"            1 rows

multi-byte


SQL> select VALUE 
  from nls_database_parameters 
  where parameter='NLS_CHARACTERSET';
VALUE
-----------
UTF8
$ impdp scott/tiger dumpfile=t.dmp
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type 
  TABLE_EXPORT/TABLE/TABLE_DATA
ORA-02374: conversion error loading table 
  "SCOTT"."T"
ORA-12899: value too large for column X 
  (actual: 2, maximum: 1)
ORA-02372: data for row: X : 0X'E9'
. . imported "SCOTT"."T"            0 out of 1 rows

How do I import my data?

1) import the metadata


$ impdp scott/tiger dumpfile=t.dmp content=metadata_only
Processing object type TABLE_EXPORT/TABLE/TABLE

2) change the char_used of the column(s) from (B)yte to (C)har

SQL> select 
  column_name, char_used, data_length, data_type 
  from user_tab_columns 
  where table_name='T' and char_used='B';
COLUMN_NAME  C DATA_LENGTH DATA_TYPE
------------ - ----------- ---------
X            B           1 CHAR
SQL> alter table t modify x char(1 char);
Table altered.

3) import the data

$ impdp scott/tiger dumpfile=t.dmp content=data_only
Processing object type 
  TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "SCOTT"."T"            1 rows

4) check

SQL> select x, length(x), lengthb(x) from t;
X  LENGTH(X) LENGTHB(X)
- ---------- ----------
é          1          2

My column has now a length of one char and two bytes.

ssl version

I wrote about ssl version in jdbc thin yesterday

The default version also no longer works for the thick client with 12c client and 11g Server.

With 11gR2 :


C:> tnsping (DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=SRV01)(PORT=1521)))
TNS Ping Utility for 64-bit Windows: Version 11.2.0.4.0
OK (100 msec)

with 12cR1 :


C:> tnsping (DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=SRV01)(PORT=1521)))
TNS Ping Utility for 64-bit Windows: Version 12.1.0.1.0
TNS-12560: TNS:protocol adapter error

in trace file I see


ntzgsvp: no SSL version specified - using default version 0
ntzdosecneg: SSL handshake failed with error 29048.
ntzCreateConnection: returning NZ error 29048 in result structure
ntzCreateConnection: failed with error 542
nserror: nsres: id=0, op=65, ns=12560, ns2=0; nt[0]=29048, nt[1]=542, nt[2]=0; ora[0]=29048, ora[1]=0, ora[2]=0

I could not see this as a documented change yet, but if you force ssl_version to be 3.0, both client versions works


C:> tnsping (DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=SRV01)(PORT=1521)))
TNS Ping Utility for 64-bit Windows: Version 12.1.0.1.0
OK (80 msec)
C:> find "version" tnsping.trc sqlnet.ora

---------- TNSPING.TRC
nlstddp_dump_ptable:   ssl_version = 3.0
ntzGetStringParameter: found value for "ssl_version" configuration parameter: "3.0"

---------- SQLNET.ORA
ssl_version=3.0

check if using tcps part II

in your current session, as written there, check sys_context('USERENV', 'NETWORK_PROTOCOL')

in another session, you could grab some hints out of the network service banner. Do the maths, when it is not-not using ssl, it probably is…


select sid,program,
  case when program not like 'ora___@% (P%)' then
  (select max(case
when NETWORK_SERVICE_BANNER like '%TCP/IP%' 
      then 'TCP'
when NETWORK_SERVICE_BANNER like '%Bequeath%' 
      then 'BEQUEATH'
when NETWORK_SERVICE_BANNER like '%IPC%' 
      then 'IPC'
when NETWORK_SERVICE_BANNER like '%SDP%' 
      then 'SDP'
when NETWORK_SERVICE_BANNER like '%NAMED P%' 
      then 'Named pipe'
when NETWORK_SERVICE_BANNER is null 
      then 'TCPS' end)
    from V$SESSION_CONNECT_INFO i 
    where i.sid=s.sid) end protocol
  from v$session s;

       SID PROGRAM         PROTOCOL
---------- --------------- --------
       415 sqlplus(TNS V1- BEQUEATH
       396 sqlplus(TNS V1- IPC     
         6 Toad            TCP     
         9 Toad            TCPS    
         1 oracle(DIAG)            
       403 Toad            TCP     

disable commit in procedure

There is an obscure syntax that prevents a procedure from issuing a commit


alter session DISABLE COMMIT IN PROCEDURE;

According to the doc, it prevents procedure from committing your data

Test case


SQL> alter session DISABLE COMMIT IN PROCEDURE
Session altered.
SQL> create table t(x number)
Table created.
SQL> create or replace procedure p is 
begin
  commit;
end;
Procedure created.
SQL> insert into t values (1)
1 row created.
SQL> exec p
BEGIN p; END;
Error at line 17
ORA-00034: cannot COMMIT in current PL/SQL session
ORA-06512: at "SCOTT.P", line 3
ORA-06512: at line 1

But some sys procedures may bypass this restriction


SQL> exec dbms_stats.gather_table_stats(user, 'T')
 PL/SQL procedure successfully completed.
SQL> rollback
Rollback complete.
SQL> select * from t

         X
----------
         1

The row was silently committed.

The long long route to Kerberos

If you want to single-sign-on to your database with your Windows credentials, be aware, it is hard! But the benefit is quite valuable, no more saved password on the client, central password management and user expiration, compliance to the security guidelines, and at no extra cost

Landscape for my setup

  • One PC with Windows (PC01.EXAMPLE.COM)
  • One DB Server with Unix (DBSRV01.EXAMPLE.COM)
  • One Microsoft Active Directory Server (MSAD01.EXAMPLE.COM)

Username

  • user01

Tools for troubleshooting

My Software

  • PC : Oracle Client 11.2.0.3
  • Unix : Oracle Server 11.2.0.4
  • On AD : MSAD 2008

There are a lot of buggy releases (it makes me think Oracle does not test Kerberos properly)

Some hits :
11.2.0.2 : Bug 12635212 : TCP/88 is not working.
12.1.0.1 : Bug 17890382 : ZTK return value: 6

Also your PC must be using Kerberos (which is the case if you login to your Active Directory). The DB server needs some client libraries (krb5.client.rte on AIX).

System changes:

  • PC : edit etc\services
  • 
    C:\> find " 88" %SystemRoot%\system32\drivers\etc\services
    
    ---------- C:\WINDOWS\SYSTEM32\DRIVERS\ETC\SERVICES
    kerberos   88/tcp    kerberos5 krb5 kerberos-sec  #Kerberos
    kerberos   88/udp    kerberos5 krb5 kerberos-sec  #Kerberos
    

  • Unix : edit /etc/services
  • 
    $ grep -w 88 /etc/services
    kerberos  88/tcp  kerberos5 krb5  # Kerberos
    kerberos  88/udp  kerberos5 krb5  # Kerberos
    

  • On AD : disable pre-authentication
    this option has to be set for every user, under user -> user01 -> Properties -> Account -> Account options -> Select “Do not require Kerberos preauthentication”

Those are quite painful. There is a bug 2458563 fixed in 8.1.7.4.99 (whatever it means) that should have addressed pre-authentication. still required on 11.2.0.4 apparently no longer needed with a 11.2.0.4 client
Editing etc/services to add the “kerberos5″ string means you need admin rights on Windows and root on Unix.

Okay, now you need to create the config files. You probably should use Kerberos v5 MIT.

Kerberos5 was released in 1993, not sure why you want to use something older than this… Okay, for kerberos4, released in the 80’s, you would need on the PC and on the DB Server something like


EXAMPLE.COM
EXAMPLE.COM MSAD01.EXAMPLE.COM admin server

Otherwise you need to specify : sqlnet.kerberos5_conf_mit=true
I have an open SR to support regarding : 12c upgrade guide
The SQLNET.KERBEROS5_CONF_MIT networking parameter is no longer supported in sqlnet.ora

Okay, here the configuration files

krb5.conf on the database server and on the PC


[libdefaults]
default_realm = EXAMPLE.COM

[realms]
EXAMPLE.COM = {
  kdc = MSAD01.EXAMPLE.COM
}

[domain_realm]
.example.com = EXAMPLE.COM
example.com = EXAMPLE.COM

The config file location (kerberos4 or 5) is specified by sqlnet.kerberos5_conf.

There should be a technical account for your db server created on the MSAD that matched your db server.

On Active Directory, you create a user (e.g. : oracle_DBSRV01) who must not change password on first login. Then you extract the keytab with ktpass


ktpass.exe -princ oracle/dbsrv01.example.com@EXAMPLE.COM -mapuser oracle_DBSRV01 -crypto all -pass password -out c:\dbsrv01.keytab

As an Oracle DBA, you will probably ask this to another team who is used to Kerberos.

To verify it, you can list the content of the keytab


$ $ORACLE_HOME/bin/oklist -k dbsrv01.keytab
Kerberos Utilities for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production on 16-JAN-2014 12:45:08

Copyright (c) 1996, 2013 Oracle.  All rights reserved.

Service Key Table: dbsrv01.keytab

Ver      Timestamp                    Principal
 4  01-Jan-1970 01:00:00  oracle/dbsrv01.example.com@EXAMPLE.COM

The principal name must match your full qualified host name. You cannot use a DNS alias.

On your PC check for the login name :


PS> $o = New-Object DirectoryServices.DirectorySearcher; 
  $o.Filter = 'servicePrincipalName=oracle/dbsrv01.example.com'; 
  $o.FindOne().properties.samaccountname
oracle_DBSRV01

And you can verify the principal of that user


C:\> setspn -L oracle_DBSRV01
Registered ServicePrincipalNames for CN=oracle_DBSRV01,OU=MiscUsers,DC=example,DC=com
:
        oracle/dbsrv01.example.com

Now you’ve got your keytab, this must be on the DB Server only (and must be readable for oracle). The location is specified by SQLNET.KERBEROS5_KEYTAB.

Next step is the credential cache (CC) parameter. On your PC with the Oracle 11g client, you must set sqlnet.kerberos5_cc_name to OSMSFT://
On the server it is not neeeded. On Oracle 12c client, you must set it MSLSA:, but due to bug 17890382, it is not working yet (metalink comment : We will have to wait […] bugs are under investigation).

But before you start, you may want to test the ticket.

On Unix, you can get the ticket with kinit and check it with klist. You need to have your configuration in /etc/krb5/krb5.conf (OS Dependent). Do not forget to destroy your credential cache with kdestroy / okdstry while testing

For the DB Server


$ /usr/krb5/bin/kinit -k -t dbsrv01.keytab oracle/dbsrv01.example.com@EXAMPLE.COM
$ /usr/krb5/bin/klist
Ticket cache:  FILE:/var/krb5/security/creds/krb5cc_99
Default principal:  oracle/dbsrv01.example.com@EXAMPLE.COM

Valid starting     Expires            Service principal
01/16/14 17:41:26  01/17/14 03:41:26  krbtgt/EXAMPLE.COM@EXAMPLE.COM
        Renew until 01/17/14 17:41:26


$ /usr/krb5/bin/kinit user01@EXAMPLE.COM
Password for user01@EXAMPLE.COM:
Ticket cache:  FILE:/var/krb5/security/creds/krb5cc_99
Default principal:  user01@EXAMPLE.COM

Valid starting     Expires            Service principal
01/16/14 17:35:57  01/17/14 03:35:57  krbtgt/EXAMPLE.COM@EXAMPLE.COM
        Renew until 01/17/14 17:35:57

Now we can test the okinit (oracle kinit) tool to do the same. There are some specific trace options that could be set

Here the complete sqlnet.ora on the server


DIAG_ADR_ENABLED = OFF
TRACE_DIRECTORY_OKINIT = /var/opt/oracle/krb/cc
TRACE_FILE_OKINIT = okinit
TRACE_LEVEL_OKINIT = SUPPORT
SQLNET.AUTHENTICATION_KERBEROS5_SERVICE = oracle
SQLNET.AUTHENTICATION_SERVICES= (BEQ,KERBEROS5)
SQLNET.KERBEROS5_CC_NAME = /var/opt/oracle/krb/cc/krb5cc_99
SQLNET.KERBEROS5_CONF = /var/opt/oracle/krb/krb5.conf
SQLNET.KERBEROS5_KEYTAB = /var/opt/oracle/krb/dbsrv01.keytab
sqlnet.kerberos5_conf_mit=true

Note the authentication service. If kerberos is not working, you may no longer be able to log / as sysdba and also some db links may no longer work.

Also note SQLNET.AUTHENTICATION_KERBEROS5_SERVICE, which is the prefix of your principal, oracle/dbsrv01.example.com@EXAMPLE.COM

Then we use okinit as we did for kinit


$ $ORACLE_HOME/bin/okinit -k -t dbsrv01.keytab oracle/dbsrv01.example.com@EXAMPLE.COM

Kerberos Utilities for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production on 16-JAN-2014 17:52:21

Copyright (c) 1996, 2013 Oracle.  All rights reserved.
$ $ORACLE_HOME/bin/oklist

Kerberos Utilities for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production on 16-JAN-2014 17:55:27

Copyright (c) 1996, 2013 Oracle.  All rights reserved.

Ticket cache: /var/opt/oracle/krb/cc/krb5cc_99
Default principal: oracle/dbsrv01.example.com@EXAMPLE.COM

   Valid Starting           Expires            Principal
16-Jan-2014 17:54:30  17-Jan-2014 01:54:30  krbtgt/EXAMPLE.COM@EXAMPLE.COM

$ $ORACLE_HOME/bin/okinit user01@EXAMPLE.COM

Kerberos Utilities for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production on 16-JAN-2014 18:15:02

Copyright (c) 1996, 2013 Oracle.  All rights reserved.

Password for user01@EXAMPLE.COM:
$ $ORACLE_HOME/bin/oklist

Kerberos Utilities for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production on 16-JAN-2014 18:15:12

Copyright (c) 1996, 2013 Oracle.  All rights reserved.

Ticket cache: /var/opt/oracle/krb/cc/krb5cc_99
Default principal: user01@EXAMPLE.COM

   Valid Starting           Expires            Principal
16-Jan-2014 18:15:06  17-Jan-2014 02:15:02  krbtgt/EXAMPLE.COM@EXAMPLE.COM

In case of error, grep for “k5″ in the trace file.

Do the same on the PC01. To test okinit, temporary change the CC cache to a file


sqlnet.authentication_services=(kerberos5)
SQLNET.AUTHENTICATION_KERBEROS5_SERVICE=oracle

DIAG_ADR_ENABLED=off
TRACE_DIRECTORY_CLIENT=C:\oracle\krb5
TRACE_UNIQUE_CLIENT=on
TRACE_FILE_CLIENT=kerb_client

sqlnet.kerberos5_conf_mit=true
sqlnet.kerberos5_conf=C:\oracle\krb5\krb5.conf
#sqlnet.kerberos5_cc_name=OSMSFT://
sqlnet.kerberos5_cc_name=C:\oracle\krb5\krbcc

and test as in Unix.

Some errors will be easier to find out with a network sniffer on port 88

With AIX


tcpdump -v -v port 88

On Windows
Start -> Microsoft Network Monitor -> Microsoft Network Monitor -> New capture -> Display filter


Frame.Ethernet.Ipv4.TCP.Port == 88 or Frame.Ethernet.Ipv4.UDP.Port == 88

-> Apply -> Start

If you for instance only see UDP packets but no TCP packets, you probably hit bug 12635212.

I still have some KDC_ERR_S_PRINCIPAL_UNKNOWN errors with my working setup, don’t worry about those.

Ok, now that okinit works, the next step is to log in the database.

Check the adapters on the db server


$ adapters

Installed Oracle Net transport protocols are:

    IPC
    BEQ
    TCP/IP
    SSL
    RAW

Installed Oracle Net naming methods are:

    Local Naming (tnsnames.ora)
    Oracle Directory Naming
    Oracle Host Naming
    Oracle Names Server Naming
    NIS Naming

Installed Oracle Advanced Security options are:

    RC4 40-bit encryption
    RC4 56-bit encryption
    RC4 128-bit encryption
    RC4 256-bit encryption
    DES40 40-bit encryption
    DES 56-bit encryption
    3DES 112-bit encryption
    3DES 168-bit encryption
    AES 128-bit encryption
    AES 192-bit encryption
    AES 256-bit encryption
    MD5 crypto-checksumming
    SHA-1 crypto-checksumming
    Kerberos v5 authentication
    RADIUS authentication

Create the user on the database db01 on the server dbsrv01. You need to have OS_AUTHENT_PREFIX=”” and do not set REMOTE_OS_AUTHENT (if you have it set, why would you need Kerberos?)


SQL> create user user01 identified externally as 'user01@example.com';
User created.
SQL> grant create session to user01;
Grant succeeded.

Connect from the PC


$ sqlplus -L /@db01

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jan 16 18:40:43 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>

This also works in Toad, SQL-Developer and other tools using the OCI thick client. Just let “User” and “Password” blank.

In SQLDeveloper, make sure you do not check Kerberos but you use OCI Thick and no username and password

There is probably a way to do it with the jdbc thin client as document in Note 1523651.1, I have not gone that far yet

Update:
Note 303436.1 : Improper format of configuration file: Remove TAB characters from KRB5.conf file. Replace with spaces.

Changing the log apply delay (DelayMins)

Whenever you change the DelayMins setting in Dataguard, you must remember it affects only logs that have not been shipped yet.


DGMGRL> show database sDB01 delaymins
  DelayMins = '5'

DGMGRL> edit DATABASE sDB01 set property delaymins=2;
Property "delaymins" updated

ARC3: Archive log thread 1 sequence 3199 available in 5 minute(s)
Tue Dec 03 15:34:59 2013
ARC0: Archive log thread 1 sequence 3200 available in 2 minute(s)
Tue Dec 03 15:35:15 2013

SQL> select sysdate, SEQUENCE# from v$managed_standby where process='MRP0'

SYSDATE              SEQUENCE#
------------------- ----------
2013-12-03_15:38:00       3199

The old logs are not affected. Let’s wait until the latest Delay=5 got applied.


Tue Dec 03 15:40:02 2013
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3199_827686279.arc
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3200_827686279.arc
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3201_827686279.arc
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3202_827686279.arc
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3203_827686279.arc
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3204_827686279.arc
Media Recovery Log /u01/app/oracle/admin/DB01/arch/DB01_1_3205_827686279.arc

All files which had a delay=2 were “pending” apply. Now we got the apply=2 behavior

Same if you increase the value


DGMGRL> edit DATABASE sDB01 set property delaymins=30;
Property "delaymins" updated

SQL> select sysdate, SEQUENCE# from v$managed_standby where process='MRP0';

SYSDATE              SEQUENCE#
------------------- ----------
2013-12-03_15:49:04       3224

ARC3: Archive log thread 1 sequence 3224 available in 2 minute(s)
Tue Dec 03 15:47:22 2013

Here again, the old logs are not affected, we need to wait until the last delay=2 got applied to get a delay=30 behavior.

While you cannot change the delay, there is still a way to workaround the problem.

If you want to immediately increase log to 30 minutes, turn off applying for half an hour.


DGMGRL> edit DATABASE sDB01 set state='APPLY-OFF';
Succeeded.
-- coffee break
DGMGRL> edit DATABASE sDB01 set state='APPLY-ON';
Succeeded.

If you want to decrease log from 30 to 2 minutes right now and immediately apply the old logs which have reached this threshold, use sqlplus


ARC1: Archive log thread 1 sequence 3253 available in 30 minute(s)
Tue Dec 03 16:01:26 2013
ARC3: Archive log thread 1 sequence 3254 available in 2 minute(s)
Tue Dec 03 16:01:37 2013

DGMGRL> edit DATABASE sDB01 set state='APPLY-OFF';
Succeeded.

SQL> recover automatic standby database until time '2013-12-03_16:01:30';
Media recovery complete.

DGMGRL> edit DATABASE sDB01 set state='APPLY-ON';
Succeeded.

I wrote on delay standby failover here : here

sqlnet.ora, sqlplus.exe and tnsping.exe inconsistencies

if you use tnsping.exe and sqlplus.exe, the way the sqlnet.ora and tnsnames.ora are located differs

Let’s take the following setup


C:\tmp>type dir1\sqlnet.ora
NAMES.DEFAULT_DOMAIN = (EXAMPLE.COM)
NAMES.DIRECTORY_PATH = (TNSNAMES)

C:\tmp>type dir1\tnsnames.ora
db.example.com=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srv1.example.com)(PORT=1521))(CONNECT_DATA=(SID=db01)))
db.example.org=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srv1.example.org)(PORT=1521))(CONNECT_DATA=(SID=db01)))

C:\tmp>type dir2\sqlnet.ora
NAMES.DEFAULT_DOMAIN = (EXAMPLE.ORG)
NAMES.DIRECTORY_PATH = (TNSNAMES)

C:\tmp>type dir2\tnsnames.ora
db.example.com=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srv2.example.com)(PORT=1521))(CONNECT_DATA=(SID=db02)))
db.example.org=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srv2.example.org)(PORT=1521))(CONNECT_DATA=(SID=db02)))

You set TNS_ADMIN to dir1 and your current directory is dir2.

Let’s try TNSPING.EXE first


C:\tmp>cd dir2

C:\tmp\dir2>set TNS_ADMIN=C:\tmp\dir1

C:\tmp\dir2>tnsping db

TNS Ping Utility for 64-bit Windows: Version 12.1.0.1.0 - Production on 25-NOV-2013 15:47:31

Copyright (c) 1997, 2013, Oracle.  All rights reserved.

Used parameter files:
C:\tmp\dir1\sqlnet.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srv2.example.com)(PORT=1521))(CONNECT_DATA=(SID=db02)))
OK (0 msec)

TNSPING.EXE is using the sqlnet.ora in %TNS_ADMIN% directory (EXAMPLE.COM domain) and the tnsnames.ora in the current directory (db02)

Let’s try with sqlplus


C:\tmp>cd dir2

C:\tmp\dir2>set TNS_ADMIN=C:\tmp\dir1

C:\tmp\dir2>sqlplus -L system@db

SQL*Plus: Release 12.1.0.1.0 Production on Mon Nov 25 16:01:15 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select * from global_name;

GLOBAL_NAME
-------------------------------------------------
DB02.EXAMPLE.ORG

SQLPLUS.EXE is using the sqlnet.ora in the current directory (EXAMPLE.ORG) and the tnsnames.ora in the current directory (db02)

This does not reproduce on Linux

Monitoring details on your explain plan

We know how to generate an explain plan. SET AUTOT ON, the good old ambulance in Toad (which is no longer an ambulance those days), the explain plan button in SQL Developer or simply EXPLAIN PLAN


create table t1 partition by hash(object_id) partitions 16 as select * from dba_objects;
exec dbms_stats.gather_table_stats(user, 'T1')
explain plan for 
  select /*+ PARALLEL(x, 4) */ * from t1 x, t1 y 
  where x.object_name = y.object_name and x.owner != y.owner

Explain plan writes in the PLAN_TABLE and could be displayed with


SQL> select * from table(dbms_xplan.display)

PLAN_TABLE_OUTPUT
--------------------------------
Plan hash value: 2344570521

---------------------------------------------------------------
| Id  | Operation                  | Name     | Rows  | Bytes |
---------------------------------------------------------------
|   0 | SELECT STATEMENT           |          | 18287 |  3500K|
|   1 |  PX COORDINATOR            |          |       |       |
|   2 |   PX SEND QC (RANDOM)      | :TQ10002 | 18287 |  3500K|
|*  3 |    HASH JOIN               |          | 18287 |  3500K|
|   4 |     PX RECEIVE             |          | 19219 |  1839K|
|   5 |      PX SEND HYBRID HASH   | :TQ10001 | 19219 |  1839K|
|   6 |       STATISTICS COLLECTOR |          |       |       |
|   7 |        PX BLOCK ITERATOR   |          | 19219 |  1839K|
|   8 |         TABLE ACCESS FULL  | T1       | 19219 |  1839K|
|   9 |     BUFFER SORT            |          |       |       |
|  10 |      PX RECEIVE            |          | 19219 |  1839K|
|  11 |       PX SEND HYBRID HASH  | :TQ10000 | 19219 |  1839K|
|  12 |        PARTITION HASH ALL  |          | 19219 |  1839K|
|  13 |         TABLE ACCESS FULL  | T1       | 19219 |  1839K|
---------------------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"."OBJECT_NAME"="Y"."OBJECT_NAME")
       filter("X"."OWNER"<>"Y"."OWNER")

But if you want to display progress on a currently running query, use DBMS_SQLTUNE (or Oracle Enterprise Manager SQL Monitoring):


SQL> set lin 150 longc 150 long 1000000;
SQL> select DBMS_SQLTUNE.REPORT_SQL_MONITOR(sql_plan_hash_value
  =>2344570521) from dual;

DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_PLAN_HASH_VALUE=>2344570521) 
----------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
select /*+ PARALLEL(x, 4) */ * from t1 x, t1 y where 
x.object_name = y.object_name and x.owner != y.owner

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  1
 Session             :  SCOTT (25:10369)
 SQL ID              :  0dpj0fxm2gf81
 SQL Execution ID    :  16777216
 Execution Started   :  08/12/2013 14:48:26
 First Refresh Time  :  08/12/2013 14:48:26
 Last Refresh Time   :  08/12/2013 14:48:59
 Duration            :  34s
 Module/Action       :  SQL*Plus/-
 Service             :  SYS$USERS
 Program             :  sqlplus.exe
 Fetch Calls         :  19

DBMS_SQLTUNE.REPORT_SQL_MONITOR(SQL_PLAN_HASH_VALUE=>2344570521)
----------------------------------------------------------------

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.25 |    0.13 |     0.12 |    19 |    752 |
=================================================

Parallel Execution Details (DOP=4 , Servers Allocated=8)
========================================================
|      Name      | Type  | Server# | Elapsed |   Cpu   |
|                |       |         | Time(s) | Time(s) |
========================================================
| PX Coordinator | QC    |         |    0.08 |    0.03 |
| p000           | Set 1 |       1 |    0.03 |    0.02 |
| p001           | Set 1 |       2 |    0.03 |    0.02 |
| p002           | Set 1 |       3 |         |         |
| p003           | Set 1 |       4 |    0.03 |    0.02 |
| p004           | Set 2 |       1 |    0.02 |    0.01 |
| p005           | Set 2 |       2 |    0.03 |    0.02 |
| p006           | Set 2 |       3 |    0.03 |    0.00 |
| p007           | Set 2 |       4 |    0.02 |    0.00 |
========================================================

SQL Plan Monitoring Details (Plan Hash Value=2344570521)
==================================================================
| Id   |          Operation          |  Rows   | Cost |   Rows   |
|      |                             | (Estim) |      | (Actual) |
==================================================================
| -> 0 | SELECT STATEMENT            |         |      |      271 |
| -> 1 |   PX COORDINATOR            |         |      |      271 |
|    2 |    PX SEND QC (RANDOM)      |   18287 |  202 |      615 |
|    3 |     HASH JOIN               |   18287 |  202 |      538 |
|    4 |      PX RECEIVE             |   19219 |   44 |    14481 |
|    5 |       PX SEND HYBRID HASH   |   19219 |   44 |    19219 |
|    6 |        STATISTICS COLLECTOR |         |      |    19219 |
|    7 |         PX BLOCK ITERATOR   |   19219 |   44 |    19219 |
|    8 |          TABLE ACCESS FULL  |   19219 |   44 |    19219 |
|    9 |      BUFFER SORT            |         |      |     1316 |
|   10 |       PX RECEIVE            |   19219 |  158 |    14481 |
|   11 |        PX SEND HYBRID HASH  |   19219 |  158 |    19219 |
|   12 |         PARTITION HASH ALL  |   19219 |  158 |    19219 |
|   13 |          TABLE ACCESS FULL  |   19219 |  158 |    19219 |
==================================================================

The small -> sign shows you where it is and display some actual (not estimates) info.

If I run it again :


==================================================================
| Id   |          Operation          |  Rows   | Cost |   Rows   |
|      |                             | (Estim) |      | (Actual) |
==================================================================
| -> 0 | SELECT STATEMENT            |         |      |     6451 |

For the same query, we see some progress (6451 rows now).

Check you have licensed the appropriate tuning options before using DBMS_SQLTUNE

Delete one billion row

To delete large number of rows, for instance rows with date until 2010, you can issue this simple statement.


SQL> DELETE FROM T WHERE C<DATE '2011-01-01';
1'000'000'000 rows deleted 
Elapsed: 23:45:22.01
SQL> commit;

This is perfectly fine. The table remains online, other users are not much affected (maybe they will not even notice the lower IO performance).

It will generate quite a lot of UNDO, and you will need enough space for archivelog and a large undo tablespace and a large undo retention setting (to prevent ORA-01555 snapshot too old).

If your table is like 100G big, you do it during week-end, you have 500Gb Undo and 250G free space in your archive destination, you will be fine. Well. Maybe.

There are workarounds where you create a new table then rename etc… but this is not the scope of this post and you will need to validate your index / foreign keys / online strategy with the application guys.

Another way to decrease runtime pro statement and undo requirement pro statement (but increase overall elapsed time) is to divided it chunks, for instance to delete 100’000’000 rows each night during 10 days.


SQL> DELETE FROM T WHERE C<DATE '2011-01-01' AND ROWNUM<=100000000;
100'000'000 rows deleted 
Elapsed: 04:11:15.31
SQL> commit;

Or if you want to delete in much smaller chunks to accomodate your tiny undo tablespace, you could try


BEGIN
  LOOP
    DELETE FROM T WHERE C<DATE '2011-01-01' AND ROWNUM <= 1000;
    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
   END LOOP;
END;
/

This will run longer than a single transaction, but it is quite usefull if your undo tablespace is too small. Also if you abort it (CTRL-C or kill session), you will not lose all progresses (but you lose on integrity/atomicity) and your KILL SESSION will not last for ever. With a single transaction, your session may be marked as killed for hours/days…

When v$session_longops is not long enough

With large table scans, sometimes the estimated total work is far beyond reality


SQL> select message from v$session_longops where target='SCOTT.EMP';

MESSAGE
------------------------------------------------------------
Table Scan:  SCOTT.EMP: 7377612 out of 629683 Blocks done

The total work is the Oracle estimation :


SQL> select blocks from dba_tables where table_name='EMP';

    BLOCKS
----------
    629683

This may differ quite a lot from the segment size, for instance if the table is not very often analyzed :


SQL> select blocks, sysdate, last_analyzed from dba_tables where table_name='EMP';

    BLOCKS SYSDATE             LAST_ANALYZED
---------- ------------------- -------------------
    629683 2013-04-21_09:21:47 2007-10-13_21:40:58

SQL> select blocks from dba_segments where segment_name='EMP';

    BLOCKS
----------
   7749888

I have customized my very long ops query to deal with very long waits.


col target for a20
set lin 150 pages 40000 termout off
alter session set nls_currency='%';
col PCT_DONE for 990.00L jus r
col time_remaining for 999999

select
  lo.target,lo.sofar,seg.blocks,
  lo.ELAPSED_SECONDS*seg.blocks/lo.sofar-lo.ELAPSED_SECONDS TIME_REMAINING,
  100*lo.sofar/seg.blocks PCT_DONE
from
  dba_segments seg,
  v$session_longops lo
where
  lo.units='Blocks'
  and lo.totalwork>0 and (lo.time_remaining>0 or lo.time_remaining is null)
  and regexp_substr(lo.target,'[^.]+') = seg.owner
  and regexp_substr(lo.target,'[^.]+$') = seg.segment_name
/

How big was my database last month

If you backup your database at least weekly, you could check the datafile size from your backup history.

Without catalog :


select to_char(d,'"W"IW-IYYY') compl, sum(b)/1024/1024/1024*8192 GB 
from 
(
  select max(DATAFILE_BLOCKS) b, trunc(completion_time,'IW') d 
  from v$backup_datafile 
  group by FILE# ,trunc(completion_time,'IW')
) 
group by d 
order by d;


COMPL        GB
-------- ------
W30-2012   3.73
W31-2012   4.84
W32-2012   5.00
W33-2012   5.05
W34-2012   5.35
W35-2012   5.80
W36-2012   6.12
W37-2012   6.39
W38-2012    .93
W39-2012   7.02
W40-2012   7.56
W41-2012   7.72
W42-2012   7.88
W43-2012   8.08
W44-2012   8.83
W45-2012   9.03
W46-2012   9.45
W47-2012   9.61
W48-2012  10.11
W49-2012  10.29
W50-2012  10.38

The history mostly depends on control_file_record_keep_time. If you do not use an rman catalog, set it to a high value like 93 (3M) or 366 (1Y)

With the rman catalog, use the RC_ view


select DB_NAME,to_char(d,'"W"IW-IYYY') compl, sum(b)/1024/1024/1024*8192 GB 
from 
(
  select DB_NAME,max(DATAFILE_BLOCKS) b, trunc(completion_time,'IW') d 
  from rc_backup_datafile 
  group by DB_NAME,FILE# ,trunc(completion_time,'IW')
) 
group by DB_NAME,d 
order by DB_NAME,d;


DB_NAME  COMPL        GB
-------- -------- ------
DB01     W30-2012   3.73
DB01     W31-2012   4.83
DB01     W32-2012   5.00
DB01     W33-2012   5.05
DB01     W34-2012   5.34
DB01     W35-2012   5.79
DB01     W36-2012   6.11
DB01     W37-2012   6.39
DB01     W38-2012    .93
DB01     W39-2012   7.01
DB01     W40-2012   7.56
DB01     W41-2012   7.71
DB01     W42-2012   7.87
DB01     W43-2012   8.08
DB01     W44-2012   8.82
DB01     W45-2012   9.02
DB01     W46-2012   9.44
DB01     W47-2012   9.60
DB01     W48-2012  10.10
DB01     W49-2012  10.28
DB01     W50-2012  10.37

If you need to check which table grows the most, check How big was my table yesterday. But remember, RMAN backup is free to use, AWR and the WRI$ tables require the diagnostic pack and the Enterprise edition

How big was my table yesterday

Oracle saves a lot of information on your behalf. Whenever you get yourself an AWR reported, you access some historic tables (included in the Diagnostic Pack).

Those tables could also be accessed manually.


SELECT savtime,owner,object_name,rowcnt,blkcnt
FROM sys.WRI$_OPTSTAT_TAB_HISTORY w,
  dba_objects o
WHERE 
   o.owner='SCOTT'
   AND o.object_name='EMP'
   and o.object_id = W.OBJ#
ORDER BY o.owner, o.object_name, w.savtime;


SAVTIME           OWNER    OBJECT_NAME     ROWCNT     BLKCNT
----------------- -------- ----------- ---------- ----------
2012-11-06 06:49  SCOTT    EMP           13215425     120077
2012-11-13 07:28  SCOTT    EMP           12678535     120077
2012-11-20 03:15  SCOTT    EMP           12860640     120077
2012-11-27 03:19  SCOTT    EMP           13045850     120077
2012-12-04 05:41  SCOTT    EMP           13326460     120077

To increase the retention, use DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings

tnsping and instant client

Mostly when you install your instant client, you will not have tnsping handy. You could well try to copy it from a full client, but this is cumbersome to just ping your instance.

I just created one function in my .profile


whence tnsping >/dev/null 2>&1 || 
  tnsping() { 
    sqlplus -L -s x/x@$1 </dev/null | 
      grep ORA- | 
        (grep -v ORA-01017 || echo OK)
  }

and tested it


$ tnsping db999
ORA-12154: TNS:could not resolve the connect identifier specified
$ tnsping db01
OK
$ tnsping db02
ORA-12541: TNS:no listener

Create database link and logon trigger

Today I could not understand why I was getting ORA-1031 with create database link.

After analysis, I found out a logon trigger that changed the current schema. As Tom always says, triggers are evil…

DEMO:


SQL> create user u1 identified by xxx;

User created.

SQL> grant create session, create database link to u2 identified by xxx;

Grant succeeded.

SQL> create trigger evil after logon on database  begin 
  2  execute immediate 'alter session set current_schema=u1';end;
  3  /

Trigger created.

SQL>
SQL> conn u2/xxx
Connected.
SQL> create database link l;
create database link l
                     *
ERROR at line 1:
ORA-01031: insufficient privileges

SQL> show user
USER is "U2"
SQL> select sys_context('USERENV','CURRENT_SCHEMA') from dual;

SYS_CONTEXT('USERENV','CURRENT_SCHEMA')
-----------------------------------------
U1

SQL> alter session set current_schema=u2;

Session altered.

SQL> create database link l;

Database link created.

Drop database link in another schema

Today I wrote this script :

drop_database_link.sql


accept owner char prompt "Enter database link owner : "
accept db_link char prompt "Enter link name : "

begin
  dbms_scheduler.create_job(
    job_name=>'&owner..drop_database_link',
    job_type=>'PLSQL_BLOCK',
    job_action=>'BEGIN execute immediate ''drop database link &db_link'';END;'
  );
  dbms_scheduler.run_job('&owner..drop_database_link',false);
  dbms_lock.sleep(2);
  dbms_scheduler.drop_job('&owner..drop_database_link');
end;
/

I am using the scheduler to run a job as another user. The database link owner does not need to have any privilege, neither CREATE SESSION nor CREATE JOB. It could locked and expired.

ORA-01722: invalid number and sql loader

Your manager asked you to load an Excel file in a table. Ok, you look at the header, create a basic table with meaningfull datatype, you open Excel and save as csv, you load your data. But then you get ORA-1722. What happened ?

Ok, let’s do it


create table t(x number not null, y number, z number not null);


LOAD DATA
INFILE *
INTO TABLE T
TRUNCATE
FIELDS TERMINATED BY ';'
(X, Y, Z)
BEGINDATA
1;1;1
2; ;2
3;3;3


$ sqlldr scott/tiger control=foo.ctl
$ vi foo.log
...
Record 2: Rejected - Error on table T, column Y.
ORA-01722: invalid number

Here it is pretty eye-popping, but you probably have 10 years of market data to load with hundreds of columns and most of the columns are empty or/and obsolete.

The thing is, Excel did put a space for your “number” datatype, space is not a valid number !


SQL> select to_number(' ') from dual;
select to_number(' ') from dual
                 *
ERROR at line 1:
ORA-01722: invalid number

A workaround is for each nullable numeric column to specify nullif column=blank


LOAD DATA
INFILE *
INTO TABLE T
TRUNCATE
FIELDS TERMINATED BY ';'
(X, Y NULLIF Y=BLANKS, Z)
BEGINDATA
1;1;1
2; ;2
3;3;3

$ sqlldr scott/tiger control=foo.ctl
$ vi foo.log
...
Table T:
  3 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

I hope I will remember this next time I am sqlloading from Excel into Oracle !

old-hash, SHA-1, SHA-2/512

Until pretty recently, only the dubious unsalted proprietary algorithm was available to store Oracle passwords. A bunch of tool where at the time able to decode any 6-8 characters in no time, and the rainbow approach was to precalculate all possibles passwords for a specific user.

Those time are not really for away, only starting at Oracle 11g, you could have salted/case sensitive passwords. Salted means that Scott may have many different passwords keys for tiger.


 SQL> select spare4 from user$ where name='SCOTT';
SPARE4
----------------------------------------------------------------
S:96A5FF65BFF84D7AAC6F8F00879881E8506FE57F555E5BA2927B606DC4F1

SQL> alter user scott identified by tiger;

User altered.

SQL> select spare4 from user$ where name='SCOTT';
SPARE4
----------------------------------------------------------------
S:AE23FB94A462C44A75040CE3BA731E3EF08C4A270F5940491045CBCEF63C

Some users may have only the 10g version (password not changed after migrating to 11g), some may have the 11g version of both, and -who knows- some may have already have SHA-2/512 passwords. SHA2 has many advantages. The chance that 2 passwords provides exactly the same string are much lower than in SHA1 (collision) and it performs twice faster on 64 bits servers.


SQL> select username, password_versions from dba_users where username like 'U_;
USERNAME                       PASSWORD
------------------------------ --------
U1                             10G
U2                             11G
U3                             10G 11G
U4                             12C

Probably you never saw this unless you are in beta 12. But actually it is documented in the 11gR2 Documentation.

12C if a new SHA-2 based SHA-512 hash exists

shutdown timeout

I do not like shutdown abort (see this post). I always use shutdown immediate and it always work… well almost always.

Today I discovered a 9iR2 new feature : shutdown timeout !
Shutdown Timeout
If all events blocking the shutdown do not occur within one hour, the shutdown operation aborts with the following message: ORA-01013: user requested cancel of current operation.

Ok, if and only if I am getting this ORA-1013, I shutdown abort, startup, shutdown immediate.

It is very unusual that a shutdown immediate does not terminate in 1 hour, and hard to reproduce. For this test case, I am doing a shutdown normal

1) make sure you have at least one other session open
2) shutdown normal
3) wait about 60 minutes (defined in _shutdown_completion_timeout_mins, not a supported parameter to change)


SQL> shutdown normal
ORA-01013: user requested cancel of current operation
SQL> 

Now we received a ORA-1013 (but I did not use CTRL-C). The instance is now half-stopped, most sessions and background processes like MMON, CJQ, SMCO are already dead and it is probably a good idea to restart it properly. Maybe with startup force and shutdown immediate.

SQL> startup force
ORACLE instance started.

Total System Global Area 1069252608 bytes
Fixed Size                  2166160 bytes
Variable Size             658510448 bytes
Database Buffers          402653184 bytes
Redo Buffers                5922816 bytes
Database mounted.
Database opened.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

This is all documented :
After ORA-01013 occurs, you must consider the instance to be in an unpredictable state…If subsequent SHUTDOWN commands continue to fail, you must submit a SHUTDOWN ABORT

I am extremly happy to have discovered this, because it will still allow a shutdown abort after one hour of shutdown immediate, which is an extremly rare case, but it is an enhancement for my backup scripts.

OCM 11g upgrade

Last Friday I went to Munich for THE upgrade exam. I have a few recommendations based on my preparation.

DISCLAIMER: no recommendation is based on the exam

Check you have a similar environment. You will get 11gR1 database and 10g entreprise manager. Personnaly I did my preparation on 11gR2 and 11g/12c em, I did not find useful to prepare on some oldish version. You should have some Linux / Unix around. It does not really matter if you use AIX, HPUX, Solaris or Linux.

I bought myself a keyboard with German Layout (the one with ß, Ä and Ö) to gain speed

Read each topic carefully and get proficiency:
– Database:
I know the meaning of the initialisation parameters. In need, I can quickly check one in the Reference
I am familiar with the sql syntax and have read the SQL Reference more than once

– Network configuration :
I know how to configure sqlnet.ora, tnsnames.ora and listener.ora without looking at the documentation.
I have carefully tested all possible parameters in Net Service Guide

– ADR :
I know how to set up the ADR location in the database and network.
I know how to use the command line utility, documented in utilities -> adrci

– RMAN :
I know how to configure RMAN with the configure statement. I am efficient in doing backups and restores
I have read the RMAN Reference carefully.

– Dataguard :
I am confident with the dgmgrl utility and with oem to configure and monitor dataguard.
I have read the Dataguard Broker Guide

– Materialized view
I have read and tested the examples in the Datawarehouse Guide -> Materialized views

– Secure Lob
I have played with lobs as basicfile and securefiles to check the differences and new features. I have read the secure files guide

– Streams
Streams is a very complex product. I have attended last year a five days course in Paris to only realise it is utterly tricky to maintain and debug, it case of errors.
I know where to look at if the capture, propagation or apply fail. Mostly in alert log, but also in the DBA views. I know how to restart the different processes. I am confident with both the OEM and PL/SQL Packages DBMS_STREAMS_*.
I have read the Streams Concept and Streams Replication Administration

– Resource manager
I know both the GUI and command line. I have read Admin guide -> Resource Manager

– Advisors and baselines
A very valuable resource to read is the 2 Days Perf Guide. It may sound strange to prepare an OCM exam with a 2-Days guide, but it is an awesome lecture.
To get deeper and learn the API usage, read Performance Tuning -> Optimizing SQL Statements

– Replay a capture workload
I learnt both the PL/SQL and EM usage in the Real Application Testing User Guide

Almost all the other topics are self explanatory and almost trivial once you know the syntax by heart…

I spent over one year preparing this exam and if you browse my recent posts, you will find detailled example that somehow relate to miscellaneous findings
Check mount option in linux
Transport tablespace over db links
Fast start failover
On star transformation
my first ADR package

For those of my readers who are on this way, good luck!

PS: no, I do not know the result yet…