-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
jsonhandler: allow reading yaml data from disk #254
Conversation
This commit aims to change the jsonhandler to be able to read data from disk if it is a yaml file. Note: * Simply replacing the loads call with yaml.safe_load is not enough due to the nature of the NaN checker requiring an unsafe load[1]. closes cdent#253 [1] cdent@98adca6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given how small this change is, I'm of two minds on how to proceed. Initially, I was thinking that it would make more sense as custom ContentHandler that replaced the existing JSONHandler: a subclass could override action
. However, action ought to be rafactored to make that a bit easier with a method extracted for the data load.
I'm not sure. As usual, input appreciated from @FND and @jasonamyers. And you too @trevormccasland, of course.
docs/source/jsonpath.rst
Outdated
@@ -76,11 +76,17 @@ or:: | |||
$.pets: <@pets.json | |||
$.pets[0]: <@cat.json | |||
|
|||
to use YAML files like the JSON ones above, they must be placed in a | |||
subdirectory:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may as well explain why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do.
gabbi/handlers/jsonhandler.py
Outdated
@@ -95,7 +96,10 @@ def action(self, test, path, value=None): | |||
rhs_path = test.replace_template('$' + rhs_path) | |||
info = test.load_data_file(value.replace('<@', '', 1)) | |||
info = six.text_type(info, 'UTF-8') | |||
value = self.loads(info) | |||
if value.endswith('yaml'): | |||
value = yaml.safe_load(info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You say in the commit message that you can't use solely yaml.self_load()
because of the NanChecker, but it is not clear how that is related to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good call, Looking at this again, I believe this has to do with a bug on how we test for isnan in tests/util.py we should cast the local var 'other' as a float because NaN is an IEEE 754 special value
(Pdb) other
'NaN'
(Pdb) math.isnan(other)
*** TypeError: a float is required
When making this change the test_unsafe_load tests were failing due to the NaNChecker not casting it's input argument as a float * (Pdb) other 'NaN' (Pdb) math.isnan(other) *** TypeError: a float is required
Next patch will have the new ContentHandler subclassing JSONHandler and overriding the loads method and hopefully a better fix for the NaN issue. Do we want to fail if the input is a string? I believe the gnocchi tests are telling me yes, so maybe its not an issue afterall. |
So I think I see why the test case is failing now, it appears a float('NaN') dumped by json converts to a string when loaded in yaml.
I'm still working on the YAMLHandler. I'm refactoring the ContentHandler base class to provide a load_data_file method that returns python data to get around this issue. |
Yeah, it's pretty clear that despite the rumors, at least in Python, JSON is not a strict subset of YAML and you can't load it with complete accuracy. I think this makes it a pretty clear that using as the loader needs to be something that is opt-in, and the way we do that with ContentHandlers is by having others. I agree that having an overridable Will look through the current patches to see what else is up. Thanks for plugging away at this. |
gabbi/handlers/jsonhandler.py
Outdated
@@ -54,7 +55,7 @@ def dumps(data, pretty=False, test=None): | |||
|
|||
@staticmethod | |||
def loads(data): | |||
return json.loads(data) | |||
return yaml.safe_load(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The root of the speedbumps you've hit is because you're engaging yaml.safe_load
here, instead of in the action
method, meaning yaml gets involved in all the JSON processing, not just when you are loading data from disk to be handled in a response_json_paths
.
This might be the right thing, but did you consider doing it in action
and keeping the behaviors separated?
This patch aims to allow loading yaml files with the use of a custom handler named YAMLHandler. NOTE: * modifications to suitemaker.py and case.py were needed to delegate the handlers to their respective collections because in case.py the _assert_response method will iterate over all response_handlers which includes every handle (default, response_handlers, and content_handlers) because of how they were added when the tests were made. This caused the test_yaml_handler tests to fail because both JSONHandler and YAMLHandler were called on the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two main issues now are
- the name: YAMLHandler implies more than what the thing created does (see more within)
- the adjustments to deal with multiples handlers that are of the same type are likely not in the right place, see the comments within
Thanks again for continuing to plug away at this. The overall concept of a load_data_file file is good. I'll try to find some time to look into how to manage the overlapping content handlers
docs/source/handlers.rst
Outdated
@@ -22,6 +22,12 @@ string into a data structure in ``response_data`` that is used when | |||
evaluating ``response_json_paths`` entries in a test or doing | |||
JSONPath-based ``$RESPONSE[]`` substitutions. | |||
|
|||
A YAML content handler has been added to extend the JSON handler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to call it something different than the YAML content handler because what the name YAMLContentHandler
means is that this is a handler for request and/or response bodies which are YAML. That's not what this is.
Naming is hard, so I don't have any clever ideas at the moment but will think on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with the ugly but explicit route :/ YAMLDiskLoadingJSONHandler . Alternatives could be YDLJSONHandler or YDLJHandler for short? Any other suggestions are welcome too.
gabbi/driver.py
Outdated
@@ -105,7 +105,6 @@ def build_tests(path, loader, host=None, port=8001, intercept=None, | |||
for handler in (content_handlers + response_handlers + | |||
handlers.RESPONSE_HANDLERS): | |||
handler_objects.append(handler()) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated?
gabbi/case.py
Outdated
handlers = self.response_handlers | ||
content_handler = self.get_content_handler(self.content_type) | ||
if content_handler: | ||
handlers.append(content_handler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the right fix for this. Unfortunately, we're going to need to dig a bit to figure out what the right fix is. The problem will be happening somewhere in the _register
handling in the base ResponseHandler
.
Part of the reason this is wrong is that this method is supposed to be using response handlers only (of which content handlers are a subclass, but not all responsehandlers are content handlers) so if you have to go looking in get_content_handler than a handler has not _registered properly, or has been clobbered somehow.
I haven't got the cycles to dig into it today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a class attribute to help with this.
@@ -12,7 +12,6 @@ | |||
# under the License. | |||
"""Base classes for response and content handlers.""" | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unreleased (although this is a fine tidy up)
gabbi/suitemaker.py
Outdated
if handler.content_handler: | ||
content_handlers.append(handler.content_handler) | ||
elif handler.response_handler: | ||
response_handlers.append(handler.response_handler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again this is probably not the right fix, as when a handler registers it can declare itself as both a content handler and a response handler and they are not mutually exclusive. For some things (notably the JSONHandler) it is both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a class attribute so a handler can declare itself as just a content handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this change hasn't been reverted yet. Looks like there is still some more work to do on this part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got a branch in progress that tries to address that in a slightly different way. We'll want to compare notes on that and decide which is best. I'll push it up shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I'll wait on that then. I found there was a typo in jsonhandler.py too that this was covering up.
add_reponse_handling -> add_response_handling
This commit aims to add a class attribute called add_response_handling to the ResponseHandler. It's purpose is to allow a ContentHandler to exclude itself from the response_handler collection so that more than one ContentHandler with the same content-type can run tests correctly. YAMLHandler was renamed to something pretty long, I think we can use YDL as an acronym for YAMLDiskLoading if we don't like the length.
docs/source/handlers.rst
Outdated
@@ -149,4 +153,4 @@ passed the ``response_data`` of the prior test and the argument within the | |||
Please see the `JSONHandler source`_ for additional detail. | |||
|
|||
.. _JSONHandler source: https://github.com/cdent/gabbi/blob/master/gabbi/handlers/jsonhandler.py | |||
.. _YAMLHandler tests: https://github.com/cdent/gabbi/blob/master/gabbi/tests/test_yaml_handler.py | |||
.. _YAMLDiskLoadingJSONHandler tests: https://github.com/cdent/gabbi/blob/master/gabbi/tests/test_yaml_disk_loading_jsonhandler.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this being used anywhere :/
docs/source/jsonpath.rst
Outdated
|
||
response_json_paths: | ||
$: @<subdir/values.yaml | ||
|
||
Examples like this can be found in one of gabbi's `own tests`_. | ||
|
||
When reading from disk you can apply the same JSONPath by adding a ':' to the | ||
end of your file name. This allows you to store multiple API responses into | ||
a JSON or YAML file to reduce file management when constructing your tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this may just have to read as '...multiple API responses into a JSON file...'
Writing just to ack that I've had a look at this, but will need to look more closely tomorrow to more fully understand the added class attribute. Also, whatever is going on with the failing pytest-driven tests needs to be addressed. Thanks for plugging away. |
I'm going to try some experiments myself. I think one of the issues here is that the new subclassed handler should replace the json handler entirely, but I'm struggling to trace the logic in my brain so I need to hack on the code. I'll post back here when I've had some more concrete thoughts. |
Tests with the same filename were giving weird connection errors when running the tests[1] [1] https://travis-ci.org/cdent/gabbi/jobs/400962655
That need for different named flags is almost certainly a bug in the pytest handling. Good find. I'll try to fix that. |
This is an alternate version of pull #254, with the difference in gabbi/suitemaker.py where response_handlers is only extended if the response_* key has not been seen before. This means that if the YAMLDiskLoadingJSONHandler is turned on it does all the work of the JSONHandler for respone handling, because it is only present once. At the same time, the existing handing for content handlers means that custom handlers which accept a content-type are used first (and only) for the content type it accepts. The issues with pytest and files having the same name remain here. I tried to make some change to pytest test loading to make the global cache use more distinguishing keys, but it was distracting from progressing on the other work, so I'll deal with that separately.
this patch aims to integrate the changes made in[1] revert changes related to the class attribute 'add_response_handler' in[2] and add unit tests to cover the suitemaker logic from[1]. [1]cdent#256 [2]cdent@241684e
gabbi/suitemaker.py
Outdated
for handler in handlers: | ||
default_test_dict.update(handler.test_base) | ||
if handler.response_handler: | ||
key = 'response_%s' % handler.test_key_suffix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since key is getting thrown away outside this look, constructing 'key' from 'response_%s' is probably not necessary (as it will be done for every suffix). Should be okay to use handler.test_key_suffix
directly in the set.
This is looking good. I added one more comment on a small thing that could be fixed to make it a bit more futureproof. Thanks! |
uses handler's test_key_suffix property rather than trying to recreate its private _key member's value.
This is released as 1.44.0 @trevormccasland If you get the opportunity to write a blog post or something like that about how you are using gabbi, especially the features you added, that would be great. Since people rarely read the docs all the way through, only something that does a bit of "we wanted to do this, so we added these features" kind of explanation will really get the idea across. Thanks again. |
Thanks a lot for the reviews and working through the pull request with me. I will work on a blog post as soon as we get everything together on our side. Then I'll post the link here when it's ready. |
This commit aims to change the jsonhandler to be able to read
data from disk if it is a yaml file.
Note:
due to the nature of the NaN checker requiring an unsafe load[1].
closes #253
[1] 98adca6