Showing content from https://github.com/pandas-dev/pandas/issues/47880 below:

parse 8 or 9 digit delimited dates · Issue #47880 · pandas-dev/pandas · GitHub

Feature Type

Problem Description

Currently, a string such as 01-01-2020 is parsed as a delimited date by _parse_delimited_date, whereas 1-1-2020 is parsed by dateutil.

One consequence of this is that warnings about e.g. dayfirst aren't shown in the latter case, e.g.:

>>> pd.to_datetime(['13-01-2020'], dayfirst=False)
<stdin>:1: UserWarning: Parsing dates in DD/MM/YYYY format when dayfirst=False (the default) was specified. This may lead to inconsistently parsed dates! Specify a format to ensure consistent parsing.
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
>>> pd.to_datetime(['13-1-2020'], dayfirst=False)
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
Feature Description

In

cdef inline object _parse_delimited_date(str date_string, bint dayfirst):
    """
    Parse special cases of dates: MM/DD/YYYY, DD/MM/YYYY, MM/YYYY.

    At the beginning function tries to parse date in MM/DD/YYYY format, but
    if month > 12 - in DD/MM/YYYY (`dayfirst == False`).
    With `dayfirst == True` function makes an attempt to parse date in
    DD/MM/YYYY, if an attempt is wrong - in DD/MM/YYYY

    For MM/DD/YYYY, DD/MM/YYYY: delimiter can be a space or one of /-.
    For MM/YYYY: delimiter can be a space or one of /-
    If `date_string` can't be converted to date, then function returns
    None, None

    Parameters
    ----------
    date_string : str
    dayfirst : bool

    Returns:
    --------
    datetime or None
    str or None
        Describing resolution of the parsed string.
    """
    cdef:
        const char* buf
        Py_ssize_t length
        int day = 1, month = 1, year
        bint can_swap = 0

    buf = get_c_string_buf_and_size(date_string, &length)
    if length == 10:
        # parsing MM?DD?YYYY and DD?MM?YYYY dates
        if _is_not_delimiter(buf[2]) or _is_not_delimiter(buf[5]):
            return None, None
        month = _parse_2digit(buf)
        day = _parse_2digit(buf + 3)
        year = _parse_4digit(buf + 6)
        reso = 'day'
        can_swap = 1
    elif length == 7:
        # parsing MM?YYYY dates
        if buf[2] == b'.' or _is_not_delimiter(buf[2]):
            # we cannot reliably tell whether e.g. 10.2010 is a float
            # or a date, thus we refuse to parse it here
            return None, None
        month = _parse_2digit(buf)
        year = _parse_4digit(buf + 3)
        reso = 'month'
    else:
        return None, None

some code could be added to deal with cases where buf is of length 8 or 9, i.e. where the day, the month, or both are written with a single digit (1-1-2020, 13-1-2020, 1-13-2020).
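
As a rough illustration of what handling lengths 8 and 9 might involve, here is a minimal pure-Python sketch. It is not the pandas implementation: split_delimited_date and DELIMITERS are hypothetical names, the delimiter set is an assumption read off the docstring above, and the dayfirst / can_swap logic is deliberately left out. The idea is that the year always occupies the last four characters, so the position of the first delimiter tells us whether the leading field has one or two digits.

```python
from typing import Optional, Tuple

# Assumption: delimiter set taken from the docstring above (space, '/', '-', '.').
DELIMITERS = frozenset(" /-.")


def split_delimited_date(date_string: str) -> Optional[Tuple[int, int, int]]:
    """Return (first, second, year) for M?D?YYYY, MM?D?YYYY, M?DD?YYYY
    or MM?DD?YYYY strings, or None if the shape does not match.

    Hypothetical helper: it only splits the fields. Deciding which of
    `first` and `second` is the month is left to the caller.
    """
    length = len(date_string)
    if length not in (8, 9, 10):
        return None

    # The year is the last four characters, preceded by a delimiter.
    if date_string[-5] not in DELIMITERS:
        return None
    year_part = date_string[-4:]
    head = date_string[:-5]  # "M?D", "MM?D", "M?DD" or "MM?DD"

    # The first field is one or two digits long; try both positions.
    for first_len in (1, 2):
        if first_len < len(head) and head[first_len] in DELIMITERS:
            first_part = head[:first_len]
            second_part = head[first_len + 1:]
            break
    else:
        return None

    parts = (first_part, second_part, year_part)
    if not all(p.isascii() and p.isdigit() for p in parts):
        return None
    if not 1 <= len(second_part) <= 2:
        return None
    return int(first_part), int(second_part), int(year_part)


print(split_delimited_date("1-1-2020"))    # length 8
print(split_delimited_date("13-1-2020"))   # length 9, two-digit first field
print(split_delimited_date("1-13-2020"))   # length 9, one-digit first field
```

With the fields split this way, the existing month > 12 check and swap could in principle be reused unchanged, so the dayfirst warning would fire for 13-1-2020 just as it does for 13-01-2020.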

Alternative Solutions

Always warn when using dateutil, but I don't think a warning should be necessary here.

Additional Context

If we wanted to warn whenever dateutil is called (e.g. #47828), then this would really simplify the adjustments necessary to the test suite, as a lot of tests could be kept as they are.

