How to calculate size of a CLOB in Bytes?

Recently i was dealing with a translation problem regarding multibyte characters (cyrillic). They were stored in a NCLOB and were later on moved to a VARCHAR2.

Before moving the data i had to ensure, that there will be enough space in the VARCHAR2 columns which were defined as VARCHAR2(255 BYTE).

That’s where my travel started, i had to figure out how many bytes a CLOB-Value would consume when storing in a VARCHAR2 column.

LENGTH(<clob-column>) fails, because it returns the number of characters

DBMS_LOB.getLength(<clob-column>) fails as well, because it returns number of characters

LENGTHB doesn’t work on CLOBs, so i thought: implicit type conversion, so i came up with this

LENGTHB(DBMS_LOB.SUBSTR(<clob-column>,1,2000)) this one looked promising, but it failed at the end because it would return the number of bytes this string uses in the CLOB Characterset (which is AL16UTF16)

The winner is: LENGTHB(TO_CHAR(SUBSTR(<clob-column>,1,4000))) this one really returns the size in bytes the string would consume when stored in a VARCHAR2 column.

Thanks to explicit type conversion!



  1. Steve on December 18, 2010

    I still haven’t come up with a good way to determine if a CLOB > 4K contains any multibyte characters. I ran into many of the same limitations you did above. I ended up writing a java program that read each character and determined if its code point value was greater than 255. A really ugly hack.

  2. Peter Raganitsch on December 20, 2010

    Steve,

    is it important to know whether the resulting string contains multibyte characters?

    To me it was enough to know, if the resulting string exceeds a certain length, no matter if there are multibyte characters in it, or not.

  3. Quentin Crisp on September 21, 2011

    Thanks you are a life saver!

  4. Filippos on December 7, 2011

    I ran into the same problem, albeit when trying to calculate the length of XMLTYPE contents.
    Finally the solution was to write my own function that returns a such a CLOB that LENGTH would always give the correct value !
    Just in case it is needed, I will provide my solution here :

    create or replace
    FUNCTION “BLOB_TO_CLOB” (p_blob IN BLOB) RETURN CLOB
    AS

    v_clob CLOB;
    v_dest_offset INTEGER := 1;
    v_source_offset INTEGER := 1;
    v_lang_context INTEGER := DBMS_LOB.DEFAULT_LANG_CTX;
    v_warning INTEGER := DBMS_LOB.WARN_INCONVERTIBLE_CHAR;

    BEGIN

    DBMS_LOB.CREATETEMPORARY(v_clob, TRUE);

    DBMS_LOB.CONVERTTOCLOB
    (
    DEST_LOB => v_clob,
    SRC_BLOB => p_blob,
    AMOUNT => DBMS_LOB.LOBMAXSIZE,
    DEST_OFFSET => v_dest_offset,
    SRC_OFFSET => v_source_offset,
    BLOB_CSID => DBMS_LOB.DEFAULT_CSID,
    LANG_CONTEXT => v_lang_context,
    WARNING => v_warning
    );

    RETURN v_clob;

    END BLOB_TO_CLOB;

    SELECT LENGTH(BLOB_TO_CLOB(.getBlobVal(NLS_CHARSET_ID(‘AL32UTF8″)))
    FROM DUAL;

  5. Kiran on January 6, 2012

    Hi Filippos,
    what is the .getblobval which you are using in your query?


0