Recently i was dealing with a translation problem regarding multibyte characters (cyrillic). They were stored in a NCLOB and were later on moved to a VARCHAR2.
Before moving the data i had to ensure, that there will be enough space in the VARCHAR2 columns which were defined as VARCHAR2(255 BYTE).
That’s where my travel started, i had to figure out how many bytes a CLOB-Value would consume when storing in a VARCHAR2 column.
LENGTH(<clob-column>) fails, because it returns the number of characters
DBMS_LOB.getLength(<clob-column>) fails as well, because it returns number of characters
LENGTHB doesn’t work on CLOBs, so i thought: implicit type conversion, so i came up with this
LENGTHB(DBMS_LOB.SUBSTR(<clob-column>,1,2000)) this one looked promising, but it failed at the end because it would return the number of bytes this string uses in the CLOB Characterset (which is AL16UTF16)
The winner is: LENGTHB(TO_CHAR(SUBSTR(<clob-column>,1,4000))) this one really returns the size in bytes the string would consume when stored in a VARCHAR2 column.
Thanks to explicit type conversion!
I still haven’t come up with a good way to determine if a CLOB > 4K contains any multibyte characters. I ran into many of the same limitations you did above. I ended up writing a java program that read each character and determined if its code point value was greater than 255. A really ugly hack.
Steve,
is it important to know whether the resulting string contains multibyte characters?
To me it was enough to know, if the resulting string exceeds a certain length, no matter if there are multibyte characters in it, or not.
Thanks you are a life saver!
I ran into the same problem, albeit when trying to calculate the length of XMLTYPE contents.
Finally the solution was to write my own function that returns a such a CLOB that LENGTH would always give the correct value !
Just in case it is needed, I will provide my solution here :
create or replace
FUNCTION “BLOB_TO_CLOB” (p_blob IN BLOB) RETURN CLOB
AS
v_clob CLOB;
v_dest_offset INTEGER := 1;
v_source_offset INTEGER := 1;
v_lang_context INTEGER := DBMS_LOB.DEFAULT_LANG_CTX;
v_warning INTEGER := DBMS_LOB.WARN_INCONVERTIBLE_CHAR;
BEGIN
DBMS_LOB.CREATETEMPORARY(v_clob, TRUE);
DBMS_LOB.CONVERTTOCLOB
(
DEST_LOB => v_clob,
SRC_BLOB => p_blob,
AMOUNT => DBMS_LOB.LOBMAXSIZE,
DEST_OFFSET => v_dest_offset,
SRC_OFFSET => v_source_offset,
BLOB_CSID => DBMS_LOB.DEFAULT_CSID,
LANG_CONTEXT => v_lang_context,
WARNING => v_warning
);
RETURN v_clob;
END BLOB_TO_CLOB;
SELECT LENGTH(BLOB_TO_CLOB(.getBlobVal(NLS_CHARSET_ID(‘AL32UTF8″)))
FROM DUAL;
Hi Filippos,
what is the .getblobval which you are using in your query?