Showing content from https://github.com/pandas-dev/pandas/issues/47910 below:

Allowing more control over the assertion printing format · Issue #47910 · pandas-dev/pandas · GitHub

Within my company we use pandas quite extensively, including its testing module for our unit tests, specifically assert_frame_equal. However, for the kind of data frames we use, the output of a failing assertion is completely unreadable.

In [1]: import pandas as pd
    ...: from pandas.testing import assert_frame_equal
    ...: from datetime import datetime
    ...: import numpy as np
    ...: df1 = pd.DataFrame(np.ones((365,3)),index=pd.date_range(datetime(2022,1,1),datetime(2022,12,31)),columns=['a','b','c'])
    ...: df2 = df1.copy(deep=True)
    ...: df2.iloc[-1,0] = 0
    ...: assert_frame_equal(df1,df2)
AssertionError: DataFrame.iloc[:, 0] (column name="a") are different

DataFrame.iloc[:, 0] (column name="a") values are different (0.27397 %)
[index]: [2022-01-01T00:00:00.000000000, 2022-01-02T00:00:00.000000000, 2022-01-03T00:00:00.000000000, 2022-01-04T00:00:00.000000000, 2022-01-05T00:00:00.000000000, 2022-01-06T00:00:00.000000000, 2022-01-07T00:00:00.000000000, 2022-01-08T00:00:00.000000000, 2022-01-09T00:00:00.000000000, 2022-01-10T00:00:00.000000000, 2022-01-11T00:00:00.000000000, 2022-01-12T00:00:00.000000000, 2022-01-13T00:00:00.000000000, 2022-01-14T00:00:00.000000000, 2022-01-15T00:00:00.000000000, 2022-01-16T00:00:00.000000000, 2022-01-17T00:00:00.000000000, 2022-01-18T00:00:00.000000000, 2022-01-19T00:00:00.000000000, 2022-01-20T00:00:00.000000000, 2022-01-21T00:00:00.000000000, 2022-01-22T00:00:00.000000000, 2022-01-23T00:00:00.000000000, 2022-01-24T00:00:00.000000000, 2022-01-25T00:00:00.000000000, 2022-01-26T00:00:00.000000000, 2022-01-27T00:00:00.000000000, 2022-01-28T00:00:00.000000000, 2022-01-29T00:00:00.000000000, 2022-01-30T00:00:00.000000000, 2022-01-31T00:00:00.000000000, 2022-02-01T00:00:00.000000000, 2022-02-02T00:00:00.000000000, 2022-02-03T00:00:00.000000000, 2022-02-04T00:00:00.000000000, 2022-02-05T00:00:00.000000000, 2022-02-06T00:00:00.000000000, 2022-02-07T00:00:00.000000000, 2022-02-08T00:00:00.000000000, 2022-02-09T00:00:00.000000000, 2022-02-10T00:00:00.000000000, 2022-02-11T00:00:00.000000000, 2022-02-12T00:00:00.000000000, 2022-02-13T00:00:00.000000000, 2022-02-14T00:00:00.000000000, 2022-02-15T00:00:00.000000000, 2022-02-16T00:00:00.000000000, 2022-02-17T00:00:00.000000000, 2022-02-18T00:00:00.000000000, 2022-02-19T00:00:00.000000000, 2022-02-20T00:00:00.000000000, 2022-02-21T00:00:00.000000000, 2022-02-22T00:00:00.000000000, 2022-02-23T00:00:00.000000000, 2022-02-24T00:00:00.000000000, 2022-02-25T00:00:00.000000000, 2022-02-26T00:00:00.000000000, 2022-02-27T00:00:00.000000000, 2022-02-28T00:00:00.000000000, 2022-03-01T00:00:00.000000000, 2022-03-02T00:00:00.000000000, 2022-03-03T00:00:00.000000000, 2022-03-04T00:00:00.000000000, 2022-03-05T00:00:00.000000000, 
2022-03-06T00:00:00.000000000, 2022-03-07T00:00:00.000000000, 2022-03-08T00:00:00.000000000, 2022-03-09T00:00:00.000000000, 2022-03-10T00:00:00.000000000, 2022-03-11T00:00:00.000000000, 2022-03-12T00:00:00.000000000, 2022-03-13T00:00:00.000000000, 2022-03-14T00:00:00.000000000, 2022-03-15T00:00:00.000000000, 2022-03-16T00:00:00.000000000, 2022-03-17T00:00:00.000000000, 2022-03-18T00:00:00.000000000, 2022-03-19T00:00:00.000000000, 2022-03-20T00:00:00.000000000, 2022-03-21T00:00:00.000000000, 2022-03-22T00:00:00.000000000, 2022-03-23T00:00:00.000000000, 2022-03-24T00:00:00.000000000, 2022-03-25T00:00:00.000000000, 2022-03-26T00:00:00.000000000, 2022-03-27T00:00:00.000000000, 2022-03-28T00:00:00.000000000, 2022-03-29T00:00:00.000000000, 2022-03-30T00:00:00.000000000, 2022-03-31T00:00:00.000000000, 2022-04-01T00:00:00.000000000, 2022-04-02T00:00:00.000000000, 2022-04-03T00:00:00.000000000, 2022-04-04T00:00:00.000000000, 2022-04-05T00:00:00.000000000, 2022-04-06T00:00:00.000000000, 2022-04-07T00:00:00.000000000, 2022-04-08T00:00:00.000000000, 2022-04-09T00:00:00.000000000, 2022-04-10T00:00:00.000000000, ...]
[left]:  [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ...]
[right]: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ...]

There are three things I can think of that could make this problem more manageable, in order of my preference:

1. add a parameter that prints only the rows that differ.

assert_frame_equal(df1,df2,display_diff_only=True)

index_diff: ['2022-12-31T00:00:00.000000000']
left: [1.0]
right: [0.0]

I was unable to find where the printing code for this is located in pandas, so I can't provide actual code examples, but it would not involve much more than passing only the difference between the series/dataframes to the function that prints the assertion, instead of the full objects.

2. add a parameter that prints the output in column format rather than row.

assert_frame_equal(df1,df2,display_columnar=True)

index                               a           b
2022-01-01T00:00:00.000000000       1.0        1.0
2022-01-02T00:00:00.000000000       1.0        1.0
...
2022-12-31T00:00:00.000000000       0.0        1.0

3. add a parameter that controls how datetime values are formatted.

assert_frame_equal(df1,df2,strftime="%Y-%m-%d")

index_diff: ['2022-01-01','2022-01-02',...]
left: [1.0,1.0,...]
right: [1.0,1.0,...]

As said, I couldn't find where these things live in the code base, so I can't really provide implementation examples, but they are simple and similar enough to existing functionality that I'm hoping my usage examples are clear enough.
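
Until something like this exists in pandas, the first and third ideas can be approximated on the user side. The following is only a sketch under stated assumptions: `assert_frame_equal_compact` and its `strftime` argument are hypothetical names and not part of the pandas API, and the cell-by-cell diff is only attempted when both frames carry identical row and column labels.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal


def assert_frame_equal_compact(left, right, strftime=None, **kwargs):
    """Hypothetical user-side wrapper, NOT part of pandas.

    Runs assert_frame_equal and, on failure, re-raises with a message that
    lists only the rows whose values differ.
    """
    try:
        assert_frame_equal(left, right, **kwargs)
    except AssertionError as err:
        # A cell-by-cell diff only makes sense when the labels match.
        if not (left.index.equals(right.index)
                and left.columns.equals(right.columns)):
            raise
        # Exact comparison; positions where both sides are NaN count as equal.
        differs = (left != right) & ~(left.isna() & right.isna())
        row_mask = differs.any(axis=1).to_numpy()
        if not row_mask.any():
            # The failure was about dtypes, names, etc.; keep pandas' message.
            raise
        index = left.index[row_mask]
        if strftime is not None and isinstance(index, pd.DatetimeIndex):
            index = index.strftime(strftime)
        lines = [f"index_diff: {list(index)}"]
        for col in left.columns[differs.any(axis=0).to_numpy()]:
            lines.append(f"column {col!r}")
            lines.append(f"  left:  {left.loc[row_mask, col].tolist()}")
            lines.append(f"  right: {right.loc[row_mask, col].tolist()}")
        raise AssertionError("\n".join(lines)) from err


# Same frames as in the report above.
df1 = pd.DataFrame(
    np.ones((365, 3)),
    index=pd.date_range("2022-01-01", "2022-12-31"),
    columns=["a", "b", "c"],
)
df2 = df1.copy(deep=True)
df2.iloc[-1, 0] = 0

try:
    assert_frame_equal_compact(df1, df2, strftime="%Y-%m-%d")
except AssertionError as exc:
    print(exc)
```

The original pandas error is chained with `from err`, so the full output remains available in the traceback when it is needed.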

The only other solution I'm aware of is something akin to this answer: https://stackoverflow.com/a/72452894, which is to write manual loops that do the comparison yourself so you can output the diff.
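
A loop-free variant of that workaround might rely on `DataFrame.compare`, which is available from pandas 1.1 and returns only the rows and columns that differ, laid out in columns. This is a sketch, not a replacement for `assert_frame_equal`: the helper name `assert_frame_equal_columnar` is hypothetical, `compare` raises `ValueError` unless both frames are identically labeled, and dtype differences are not reported.

```python
import numpy as np
import pandas as pd


def assert_frame_equal_columnar(left, right):
    """Hypothetical helper, NOT part of pandas: fail with a columnar diff."""
    # compare() keeps only the cells that differ; matching NaNs count as equal.
    diff = left.compare(right)
    assert diff.empty, "DataFrames differ:\n" + diff.to_string()


# Same frames as in the report above.
df1 = pd.DataFrame(
    np.ones((365, 3)),
    index=pd.date_range("2022-01-01", "2022-12-31"),
    columns=["a", "b", "c"],
)
df2 = df1.copy(deep=True)
df2.iloc[-1, 0] = 0

try:
    assert_frame_equal_columnar(df1, df2)
except AssertionError as exc:
    print(exc)
```

For these frames the diff has a single row, for 2022-12-31, with the left and right values of column `a` side by side under the `self` and `other` sub-columns.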

