API: Add entrypoint for plotting #27488

Merged
jreback merged 11 commits into pandas-dev:master from plotting-entrypoints
Jul 25, 2019

Conversation

TomAugspurger
Contributor

Libraries, including pandas, register backends via entrypoints.

xref #26747

cc @datapythonista @jakevdp @philippjfr

@TomAugspurger TomAugspurger added the Visualization plotting label Jul 20, 2019
@TomAugspurger TomAugspurger added this to the 0.25.1 milestone Jul 20, 2019
Member

@datapythonista datapythonista left a comment

Couple of comments.

@jreback
Contributor

jreback commented Jul 22, 2019

this seems like a very complex solution

why isn't the simpler

plotting.backend = 'altair.pandas_plot'
plotting.backend = 'pandas._matplotlib.plot'

much more straightforward here?

eg this is a standard import as passed to importlib
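
For reference, the suggestion above amounts to handing the option value straight to importlib. A minimal sketch, assuming a hypothetical helper `load_backend_by_path` (not part of pandas) and using the standard-library module `json.decoder` purely as a stand-in for a real backend path such as 'altair.pandas_plot':

```python
import importlib


def load_backend_by_path(dotted_path):
    """Import and return the module named by a fully qualified dotted path.

    Hypothetical helper for illustration only; it is not pandas API.
    """
    return importlib.import_module(dotted_path)


# 'json.decoder' is only a stand-in for a backend module path.
module = load_backend_by_path("json.decoder")
print(module.__name__)  # prints "json.decoder"
```

With this scheme the string the user types is the import path itself, which is the coupling the rest of the thread discusses.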

@TomAugspurger
Contributor Author

We want to decouple the backend name (altair) from the implementation. Altair doesn't want and shouldn't need a top-level altair.pandas_plot just to implement this backend (which may even be implemented in a separate package).

@jreback
Contributor

jreback commented Jul 22, 2019

and how does my suggestion not do this? it is fully decoupled

@TomAugspurger
Contributor Author

I guess I don't understand your suggestion then. You say

plotting.backend = 'altair.pandas_plot'

which seems to require a top-level altair.pandas_plot?

@jreback
Contributor

jreback commented Jul 22, 2019

not at all
these are examples of what you could pass; it could be called anything
that’s the point - we don’t care what it’s called
and no pandas code would have to change for this to be added

@jakevdp
Contributor

jakevdp commented Jul 22, 2019

The reason I proposed this is because I believe it's cleaner for the implementation's code path to be separate from how the user specifies it. My goal would be for a user to write

plotting.backend = 'altair'

and for this to cause the plotting backend to use Altair.

Before this solution, that would require me to pollute the Altair package's top-level namespace with APIs that are irrelevant to users of the package. I think entrypoints is a clean solution to that problem (decoupling of specification from code paths), and one that is in common use among packages in the jupyter ecosystem.

@jakevdp
Contributor

jakevdp commented Jul 22, 2019

I just want to avoid a situation where users have to remember, "for Matplotlib, use plotting.backend = 'matplotlib'", "for Bokeh, use plotting.backend = 'bokeh._pandas_api'", "for Altair, use plotting.backend = 'altair_pandas_backend'"...

Seems like it would lead to more confusion than necessary.

@TomAugspurger
Contributor Author

Thanks @jakevdp. Agreed entirely.

@jreback
Contributor

jreback commented Jul 22, 2019

ok I see your point @jakevdp and I agree that it is nice to de-couple the user supplied key to plotting.backend, but then again it does couple the api of the package to pandas directly, meaning we then need to make an update in pandas to actually use a new package / api.

So this is nice, I would then also allow a fully qualified module.function call to succeed

@TomAugspurger
Contributor Author

> it does couple the api of the package to pandas directly, meaning we then need to make an update in pandas to actually use a new package / api.

I may be misunderstanding, but I don't see how the change to use entrypoints affects that. If you're saying that we'll need to update pandas so that set_option('plotting.backend', 'altair') works, then that's not correct. The only change required after this is that altair (or pdvega, whatever package) adds an EntryPoint for pandas_plotting_backends in its setup.py.

Basically, pandas adds an EntryPoint "group" in our setup.py. And third party libraries add items to that group in their setup.pys. Pandas and the library don't need to talk to each other directly to register a backend. We talk through pkg_resources.

> So this is nice, I would then also allow a fully qualified module.function call to succeed

We do allow that (see test_register_import).
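
A rough sketch of the lookup described above, under stated assumptions and not the actual pandas implementation. The group name `pandas_plotting_backends` comes from this thread; the backend name `fancy`, the package path in the setup.py comment, and the helper `find_backend` are hypothetical. The standard-library `importlib.metadata.EntryPoint` stands in for the pkg_resources machinery, an in-memory list stands in for what installed packages would have registered, and `json.decoder` stands in for a real backend module:

```python
import importlib
from importlib.metadata import EntryPoint

GROUP = "pandas_plotting_backends"  # group name mentioned in this thread

# A third-party package might declare something like this in its setup.py
# (hypothetical names):
#
#   setup(
#       name="fancy-pandas-backend",
#       entry_points={GROUP: ["fancy = fancy_pkg._pandas_backend"]},
#   )
#
# Stand-in for the installed entry points, so the sketch runs anywhere.
REGISTERED = [
    EntryPoint(name="fancy", value="json.decoder", group=GROUP),
]


def find_backend(name, registered=REGISTERED):
    """Resolve a backend: registered short name first, then a plain import."""
    for entry_point in registered:
        if entry_point.group == GROUP and entry_point.name == name:
            return entry_point.load()
    # Fallback: treat the name as a fully qualified module path.
    return importlib.import_module(name)


print(find_backend("fancy").__name__)         # short name, via the entry point
print(find_backend("json.encoder").__name__)  # dotted path, via plain import
```

The short name and the import path are independent here: the package can move its implementation module without changing what users pass as the backend name.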

@jreback
Contributor

jreback commented Jul 22, 2019

> Basically, pandas adds an EntryPoint "group" in our setup.py. And third party libraries add items to that group in their setup.pys. Pandas and the library don't need to talk to each other directly to register a backend. We talk through pkg_resources.

ok so if this works then that's great. +1.

@TomAugspurger
Contributor Author

Yep, this should work out just fine.

@jreback
Contributor

jreback commented Jul 24, 2019

@datapythonista ok on this?

Member

@datapythonista datapythonista left a comment

lgtm, just a couple of small things in the docstring.

@TomAugspurger
Contributor Author

Thanks, fixed.

@jreback jreback merged commit e9a60bb into pandas-dev:master Jul 25, 2019
@jreback
Contributor

jreback commented Jul 25, 2019

thanks @TomAugspurger



@td.skip_if_no_mpl
def test_register_entrypoint():
Member

this is failing for me locally

Contributor Author

What are you seeing?

Contributor Author

Oh, I'm guessing it's a KeyError: 'pandas_plotting_backends'?

You'll need to re-run python -m pip install -e . in your pandas directory. This adds the entrypoint.

Member

Yep that does fix it, thanks
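
One way to check whether the group is visible in the current environment is to list what is registered under it. This is only a sketch: it uses the standard-library `importlib.metadata` rather than the pkg_resources call mentioned earlier in the thread, and the helper name `registered_backends` is hypothetical:

```python
from importlib.metadata import entry_points

GROUP = "pandas_plotting_backends"  # group name mentioned in this thread


def registered_backends(group=GROUP):
    """Return the sorted names of entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Pythons expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older Pythons return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


print(registered_backends())
```

An empty list would be consistent with the situation above, where the development install predates the entrypoint and needs to be re-run.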

@TomAugspurger TomAugspurger deleted the plotting-entrypoints branch July 25, 2019 20:01
quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019

5 participants