Category Archives: unicode

import into UTF8 database

A common error when you import single-byte characters (e.g. iso8859p1 or mswin1252) into multi-bytes databases (e.g. utf8) is ORA-12899: value too large for column.

The root cause is the default semantics in a database being BYTE


SQL> select VALUE, ISDEFAULT 
  from v$parameter 
  where NAME='nls_length_semantics'
VALUE   ISDEFAULT
------- ---------
BYTE    TRUE

It means, one char equals one byte. But after conversion, one char is larger than one byte and does not fit any longer.

single-byte


SQL> select VALUE 
  from nls_database_parameters 
  where parameter='NLS_CHARACTERSET';
VALUE
-------------
WE8MSWIN1252
SQL> create table t(x char(1));
Table created.
SQL> insert into t values ('é');
1 row created.
SQL> commit;
Commit complete.
$ expdp scott/tiger dumpfile=t.dmp tables=t
. . exported "SCOTT"."T"            1 rows

multi-byte


SQL> select VALUE 
  from nls_database_parameters 
  where parameter='NLS_CHARACTERSET';
VALUE
-----------
UTF8
$ impdp scott/tiger dumpfile=t.dmp
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type 
  TABLE_EXPORT/TABLE/TABLE_DATA
ORA-02374: conversion error loading table 
  "SCOTT"."T"
ORA-12899: value too large for column X 
  (actual: 2, maximum: 1)
ORA-02372: data for row: X : 0X'E9'
. . imported "SCOTT"."T"            0 out of 1 rows

How do I import my data?

1) import the metadata


$ impdp scott/tiger dumpfile=t.dmp content=metadata_only
Processing object type TABLE_EXPORT/TABLE/TABLE

2) change the char_used of the column(s) from (B)yte to (C)har

SQL> select 
  column_name, char_used, data_length, data_type 
  from user_tab_columns 
  where table_name='T' and char_used='B';
COLUMN_NAME  C DATA_LENGTH DATA_TYPE
------------ - ----------- ---------
X            B           1 CHAR
SQL> alter table t modify x char(1 char);
Table altered.

3) import the data

$ impdp scott/tiger dumpfile=t.dmp content=data_only
Processing object type 
  TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "SCOTT"."T"            1 rows

4) check

SQL> select x, length(x), lengthb(x) from t;
X  LENGTH(X) LENGTHB(X)
- ---------- ----------
é          1          2

My column has now a length of one char and two bytes.