In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'Number' : [1, 2], 'Date' : ["2017-03-02"] * 2, 'Str' : ["foo", "inf"]})

In [3]: df
Out[3]:
         Date  Number  Str
0  2017-03-02       1  foo
1  2017-03-02       2  inf

In [4]: df.groupby(['Number']).apply(lambda x: x.iloc[0])
Out[4]:
              Date  Number  Str
Number
1       2017-03-02       1  foo
2       2017-03-02       2  inf

In [5]: df.Date = pd.to_datetime(df.Date)

In [6]: df
Out[6]:
        Date  Number  Str
0 2017-03-02       1  foo
1 2017-03-02       2  inf

In [7]: df.groupby(['Number']).apply(lambda x: x.iloc[0])
Out[7]:
             Date  Number  Str
Number
1      2017-03-02       1  NaN
2      2017-03-02       2  inf

Problem description
When I convert the Date column to a pandas datetime dtype, a subsequent group-by/apply changes the dtypes of other columns in unexpected ways. Notice that the "Str" column becomes numeric in the final group-by/apply (a contributing factor is probably that one of the elements is the string "inf"): the string "inf" has become the float inf, and the string "foo" has become NaN.
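As a possible workaround, the first row of each group can be selected without going through apply() at all. The sketch below rebuilds the frame from the code sample and uses groupby(...).head(1), which returns the matching rows of the original frame rather than reassembling per-group results. That it sidesteps the coercion on pandas 0.18.1 is an assumption and has not been verified on that version; the names first_rows and strings_intact are illustrative only.

```python
import pandas as pd

# Same data as in the code sample, with Date already converted to datetime.
df = pd.DataFrame({
    "Number": [1, 2],
    "Date": pd.to_datetime(["2017-03-02"] * 2),
    "Str": ["foo", "inf"],
})

# Select the first row of each group directly instead of using
# apply(lambda x: x.iloc[0]). head(1) keeps the original rows and index.
first_rows = df.groupby("Number").head(1)

# Check that every value in Str is still a Python string.
strings_intact = all(isinstance(v, str) for v in first_rows["Str"])

print(first_rows)
print("Str values are still strings:", strings_intact)
```

Unlike the apply() call in the code sample, the result keeps the original integer index instead of being indexed by Number; calling set_index on the result would restore that layout if it is needed.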
Expected Output
I expect the Str column to remain a string type, and contain the original strings. I.e.:
        Date  Number  Str
0 2017-03-02       1  foo
1 2017-03-02       2  inf

Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.10.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: 1.3.7
pip: None
setuptools: 0.6
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.0
statsmodels: 0.6.1
xarray: 0.7.0
IPython: 5.0.0
sphinx: 1.3.5
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.1
bs4: 4.4.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.6.7.None
psycopg2: 2.5.4 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None