A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://github.com/pandas-dev/pandas/issues/10366 below:

Irregular errors when reading certain categorical strings from hdf · Issue #10366 · pandas-dev/pandas · GitHub

It seems that there is something bad happening when we use certain strings with special characters AND the empty string with categoricals:

    # -*- coding: latin-1 -*-
    import pandas
    import os

    examples = [
            pandas.Series(['EÉ, 17', '', 'a', 'b', 'c'], dtype='category'),
            pandas.Series(['EÉ, 17', 'a', 'b', 'c'], dtype='category'),
            pandas.Series(['', 'a', 'b', 'c'], dtype='category'),
            pandas.Series(['EE, 17', '', 'a', 'b', 'c'], dtype='category'),
            pandas.Series(['øü', 'a', 'b', 'c'], dtype='category'),
            pandas.Series(['Aøü', '', 'a', 'b', 'c'], dtype='category'),
            pandas.Series(['EÉ, 17', 'øü', 'a', 'b', 'c'], dtype='category')
            ]

    def test_hdf(s):
        f = 'testhdf.h5'
        if os.path.exists(f):
            os.remove(f)
        s.to_hdf(f, 'data', format='table')
        return pandas.read_hdf(f, 'data')

    for i, s in enumerate(examples):
        flag = True
        e = ''
        try:
            test_hdf(s)
        except Exception as ex:
            e = ex
            flag = False
        print('%d: %s\t%s\t%s' % (i, 'pass' if flag else 'fail', s.tolist(), e))

Results in:

    0: fail ['EÉ, 17', '', 'a', 'b', 'c']   Categorical categories must be unique
    1: pass ['EÉ, 17', 'a', 'b', 'c']
    2: pass ['', 'a', 'b', 'c']
    3: pass ['EE, 17', '', 'a', 'b', 'c']
    4: pass ['øü', 'a', 'b', 'c']
    5: fail ['Aøü', '', 'a', 'b', 'c']  Categorical categories must be unique
    6: pass ['EÉ, 17', 'øü', 'a', 'b', 'c']

Not sure if I am using this incorrectly or if this is actually a corner case.


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4