BUG: groupby(Grouper) with all-NaT grouping keys #43486

Open
3 tasks done
jbrockmendel opened this issue Sep 9, 2021 · 4 comments
Labels
Bug Datetime Datetime data dtype Groupby Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate

Comments

@jbrockmendel
Member

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({"date": [pd.NaT], "value": [11]})
grouper = pd.Grouper(freq="M", key="date")

>>> gb = df.groupby(grouper)
[...]
  File "pandas/core/resample.py", line 1878, in _get_timestamp_range_edges
    first = first.normalize()
AttributeError: 'NaTType' object has no attribute 'normalize'


### Issue Description

Example derived from test_timegrouper_apply_return_type_series


### Expected Behavior

Either not raise or raise at construction of the Grouper object with a useful exception message.

### Installed Versions

master
@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 9, 2021
@mroeschke mroeschke added Groupby Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Datetime Datetime data dtype and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 30, 2021
@WKKO

WKKO commented Apr 10, 2022

Sorry to bother you. Is there a known solution to this problem? I've been running into it recently.

@burnpanck
Contributor

I believe this is a manifestation of #24983: with the current design, pd.NaT isn't a timestamp at all; it ambiguously refers to either an undefined timedelta or an undefined timestamp. You might get lucky by explicitly making the "date" column a timestamp, but the grouper might nonetheless fail once it reaches a NaT value.

@jamsden

jamsden commented Nov 17, 2022

Has any workaround been identified? For example, could I test for all-NaT values and return an empty Series?

@kfchou

kfchou commented Nov 30, 2022

Since it wouldn't make sense to count or add anything in this situation, here's my workaround:

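    # 'key' is the datetime column passed to pd.Grouper(key=...)
    if df['key'].isna().all():
        # Replace the whole frame with an empty one that keeps the same columns
        df = pd.DataFrame(columns=df.columns)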
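
The same guard can be wrapped in a small helper so the all-NaT check always runs before `groupby`. This is only a minimal sketch: `group_sum_or_empty` and its parameters are hypothetical names, not part of pandas, and it assumes a sum aggregation and a month-end frequency purely for illustration.

```python
import pandas as pd


def group_sum_or_empty(df, key, value, freq):
    """Sum `value` per `freq` bucket of `key`.

    Hypothetical helper for illustration only, not part of pandas.
    Returns an empty Series when every grouping key is missing.
    """
    if df[key].isna().all():
        # Skip groupby entirely: with only NaT keys there are no time bins to build.
        empty_index = pd.DatetimeIndex([], name=key)
        return pd.Series([], index=empty_index, dtype=df[value].dtype, name=value)
    return df.groupby(pd.Grouper(key=key, freq=freq))[value].sum()


# The frame from the reproducible example at the top of the issue.
df = pd.DataFrame({"date": [pd.NaT], "value": [11]})
result = group_sum_or_empty(df, "date", "value", pd.offsets.MonthEnd())
print(result.empty)  # True: the guard returned before groupby was called
```

Returning an empty Series with a `DatetimeIndex` keeps the result shape close to what a successful grouped sum would produce, so downstream code does not need a separate branch.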


6 participants