Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update standardizer.py #944

Merged
merged 10 commits into from
Oct 18, 2024
Merged

Update standardizer.py #944

merged 10 commits into from
Oct 18, 2024

Conversation

jonfritz
Copy link
Contributor

Adding a new class called ignore_errors to enable the standardizers to gracefully continue when encountering null key:value pairs

Adding a new class called ignore_errors to enable the standardizers to gracefully continue when encountering null key:value pairs
@@ -283,3 +283,19 @@ def __init__(
**kwargs,
):
super().__init__(child, f=standardizer.standardize, args=path, kwargs=kwargs)

class IgnoreErrors(standardizer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite the right syntax, so I don't think this will work.

@bsowell My guess is that we would like to have this logic be more generic, perhaps a subclass of Map that swallows errors from the function being applied?

Copy link
Collaborator

@HenryL27 HenryL27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth just making this a function rather than a full class - the main benefit you get from having classes with single static methods is that you can do interfaces and stuff. I don't imagine there will be any other things that follow the same contract as IgnoreErrors so it should probably just be a function (useable like docset.map(lambda d: ignore_errors(d, standardizer, key_path))).

If you want to be really fancy (and actually I think this would be a good interface) you can make this a higher-order function - i.e. have it return a function that calls the standardizer and swallows errors. Then the usage can become docset.map(ignore_errors(standardizer, key_path)).

Update with Henry's feedback
Adding unit tests for ignore_errors
Update helper text
Copy link
Collaborator

@HenryL27 HenryL27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just get the linting to work

@jonfritz jonfritz merged commit dd17098 into main Oct 18, 2024
12 of 13 checks passed
@jonfritz jonfritz deleted the jonfritz-patch-standardizer branch October 18, 2024 22:01
@HenryL27 HenryL27 mentioned this pull request Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants