Translating User Agent Strings

For the Analyse Webserver Logfile Application (AWL) I wanted to show better information instead of just the Useragent String, because they are a mess and contain confusing information.

As an example look at this Useragent String: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36

I rather would want to show it like this: Chrome (59) OS X

There are various discussions and blogposts from more knowledgeable people out there, discussing UA and their misinformation. Due to missing regulatory those strings can’t be easily parsed with a regular expression.

Therefor I was falling back to the second best option: invoking a webservice.

After a little research I decided to go with a free webservice provided by UserAgentString.com, where you simply pass an URL encoded UA string.

As you’ve probably seen in my first blogpost for AWL I created a small datamodel to hold UA information in a separate table. That made it easy to set up a maintenance screen to edit the information and provide a more readable alias for it.

Obviously I didn’t want to manually edit thousands of different UA’s and decided to invoke the webservice through a button click (see button “Parse User Agent” in above screenshot).

Here is the Dynamic Action Code for “Execute PL/SQL Code” triggered by Button click.

BEGIN
    BEGIN
        FOR cUA IN
          ( SELECT ID
                 , AGENT_TYPE
                 , AGENT_NAME
                 , AGENT_VERSION
                 , OS_TYPE
                 , OS_NAME
                 , OS_VERSION_NAME
                 , OS_VERSION_NUMBER
                 , OS_PRODUCER
                 , OS_PRODUCER_URL
                 , LINUX_DISTRIBUTION
                 , AGENT_LANGUAGE
                 , AGENT_LANGUAGE_TAG
              FROM 
                 ( SELECT HTTPURITYPE('http://www.useragentstring.com/?uas='||utl_url.escape(USER_AGENT_STRING)||'&getJSON=all').getClob() AS UA_JSON
                        , ID
                     FROM AWL_USER_AGENT
                    WHERE ROWNUM < 100
                      AND AGENT_TYPE IS NULL
                 )
                 , JSON_TABLE 
                     ( UA_JSON
                     , '$'
                       COLUMNS 
                         ( AGENT_TYPE         VARCHAR2(2000) PATH '$.agent_type'
                         , AGENT_NAME         VARCHAR2(2000) PATH '$.agent_name'
                         , AGENT_VERSION      VARCHAR2(2000) PATH '$.agent_version'
                         , OS_TYPE            VARCHAR2(2000) PATH '$.os_type'
                         , OS_NAME            VARCHAR2(2000) PATH '$.os_name'
                         , OS_VERSION_NAME    VARCHAR2(2000) PATH '$.os_versionName'
                         , OS_VERSION_NUMBER  VARCHAR2(2000) PATH '$.os_versionNumber'
                         , OS_PRODUCER        VARCHAR2(2000) PATH '$.os_producer'
                         , OS_PRODUCER_URL    VARCHAR2(2000) PATH '$.os_producerUrl'
                         , LINUX_DISTRIBUTION VARCHAR2(2000) PATH '$.linux_distribution'
                         , AGENT_LANGUAGE     VARCHAR2(2000) PATH '$.agent_language'
                         , AGENT_LANGUAGE_TAG VARCHAR2(2000) PATH '$.agent_languageTag'
                         )
                     ) 
          )
        LOOP
            UPDATE AWL_USER_AGENT
               SET AGENT_TYPE         = cUA.AGENT_TYPE         
                 , AGENT_NAME         = cUA.AGENT_NAME         
                 , AGENT_VERSION      = cUA.AGENT_VERSION      
                 , OS_TYPE            = cUA.OS_TYPE            
                 , OS_NAME            = cUA.OS_NAME            
                 , OS_VERSION_NAME    = cUA.OS_VERSION_NAME    
                 , OS_VERSION_NUMBER  = cUA.OS_VERSION_NUMBER  
                 , OS_PRODUCER        = cUA.OS_PRODUCER        
                 , OS_PRODUCER_URL    = cUA.OS_PRODUCER_URL    
                 , LINUX_DISTRIBUTION = cUA.LINUX_DISTRIBUTION 
                 , AGENT_LANGUAGE     = cUA.AGENT_LANGUAGE     
                 , AGENT_LANGUAGE_TAG = cUA.AGENT_LANGUAGE_TAG 
             WHERE ID = cUA.ID
            ;
        END LOOP;
    EXCEPTION
        WHEN OTHERS THEN
            APEX_DEBUG.ERROR(Utl_Http.get_detailed_sqlcode  || Utl_Http.Get_Detailed_Sqlerrm);
            APEX_DEBUG.ERROR(DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
    END;
    --
    -- try to make it prettier to read the UA agent and construct an Alias (can be changed in admin section)
    UPDATE AWL_USER_AGENT
       SET ALIAS = SUBSTR(DECODE
                     ( AGENT_TYPE
                     , 'unknown', AGENT_TYPE||' ('||USER_AGENT_STRING||')'
                     , 'Crawler', AGENT_TYPE||' ('||AGENT_NAME||')'
                     , 'Browser', AGENT_NAME||' ('||REGEXP_REPLACE(AGENT_VERSION,'(\d*)\.(\d*)(.*)','\1')||') '||OS_NAME
                     , USER_AGENT_STRING
                     ),1,100)
     WHERE ALIAS      IS     NULL
       AND AGENT_TYPE IS NOT NULL
    ;
END;

Let’s look at the sourcecode in a bit more detail to see whats what:

  • Line 18 invokes the webservice through HTTPURITYPE. The webservice itself returns data in JSON format, HTTPURITYPE passes this back as a CLOB
  • Line 18-22 loops through our useragent table for the first 100 (99) unprocessed records
  • Line 24-39 converts the JSON-CLOB to a JSON_TABLE which we can select from just as if it would be a regular database table
  • Line 44-57 updates the useragent table with the received information
  • Line 67-76 now calculates an alias out of the given information. The only reason why this is done outside of the loop is that it was added at a later time and is meant to process also older records

Luckily the webservice is available through http, that means no certificate or wallet is needed.

Leave a Reply

Your email address will not be published. Required fields are marked *