Saturday, January 30, 2010

To foreign realms and back again

No travel involved, but I had a few exotic characters in a column which needed resolution.

Firstly, I'll have a look for affected rows. I excluded any entries that only contain characters that I knew were acceptable.

select dump(a,1016), a, b
from
 (select regexp_replace(COLUMN,'[[:alnum:]/''%()> -.:=;[]','') a,
         COLUMN b
  from TABLE)
where a is not null
order by a;

The DUMP told me what the characters were, with the '16' option showing them in Hex rather than decimal, and the added 1000 to show the characterset (though that wasn't strictly necessary). Then I can use those HEX values in utl_raw.cast_to_varchar2 to make them into a string that I can use in a REPLACE.

In this demonstration I'll use those 'Smart Quote' characters that come from Word etc.

update TABLE
set COLUMN =
   translate(COLUMN,utl_raw.cast_to_varchar2('9192'),'""');
where COLUMN like '%'|| utl_raw.cast_to_varchar2('9192')||'%';


You can use DUMP(col,1001) and CHR(145)||CHR(146) [or whatever your characters are]. The advantage of UTL_RAW.CAST_TO_VARCHAR2 is that multiple characters can be included in a single value, which is handy for scripts or dynamic SQL and is more compact when dealing with longer strings.

1 comment:

Anonymous said...

Thanks really resolved my issue with box/square like character that comes along with word or excel.