Showing content from https://github.com/oracle/python-oracledb/issues/279 below:

'utf-8' codec can't decode byte (outputtypehandler encoding_errors="replace" not being honored) · Issue #279 · oracle/python-oracledb · GitHub

Discussed in #272

Originally posted by rh8056 December 20, 2023
Hello,

I posted this question on StackExchange yesterday, but figured it would make sense to ask here as well.

I'm struggling to deal with what I think is corrupt data stored in an Oracle database when reading it in a Python script. I have the following:

    import oracledb

    def output_type_handler(cursor, metadata):
        # Fetch VARCHAR columns with undecodable bytes replaced instead of raising
        if metadata.type_code is oracledb.DB_TYPE_VARCHAR:
            return cursor.var(metadata.type_code,
                              arraysize=cursor.arraysize,
                              encoding_errors="replace")

    # connectOracle() is my own helper that returns a connection and a cursor
    mydb, mycursor = connectOracle()
    mycursor.outputtypehandler = output_type_handler

    mycursor.execute('''select file_id, md5_value,
                            case when file_content is not null
                                then
                                    utl_raw.cast_to_varchar2(dbms_lob.substr(file_content))
                                else
                                    ''
                                end as FILE_CONTENT from archive_file where archive_id = 123 and file_name = 'file_name.txt' ''')
    row = mycursor.fetchone()
    print(row)

When I run this, I'm getting the following:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 647: invalid continuation byte

If I run this query directly in SQL Developer, I get output with � in the file content. My SQL Developer settings are set to UTF-8, and my Oracle database NLS_NCHAR_CHARACTERSET is set to AL32UTF8. My understanding is that, with the outputtypehandler change, my Python script would also output a � in this instance.
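To illustrate the behavior I was expecting, here is a minimal sketch in plain Python with no database involved. The sample bytes are made up for illustration (they are not my actual file content); the point is only that a lone 0xc4 followed by an ASCII byte raises the same kind of error under strict decoding and becomes � with errors="replace":

```python
# Hypothetical sample data: 0xc4 starts a two-byte UTF-8 sequence,
# but the next byte ("d") is not a valid continuation byte.
raw = b"abc\xc4def"

# Strict decoding (the default) raises UnicodeDecodeError.
try:
    raw.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = str(exc)

# errors="replace" substitutes U+FFFD (the replacement character)
# for the invalid byte and keeps decoding the rest.
replaced = raw.decode("utf-8", errors="replace")

print(strict_error)
print(replaced)
```

This is what I assumed encoding_errors="replace" would do for the fetched column value.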

I added some print() calls inside output_type_handler() to verify that it is indeed being called, and they produced output, so the handler does run, but encoding_errors="replace" seems to be ignored. What am I missing here? Thanks!
