jsonhandler: allow reading yaml data from disk #254

trevormccasland · 2018-06-29T18:13:02Z

This commit aims to change the jsonhandler to be able to read
data from disk if it is a yaml file.

Note:

Simply replacing the loads call with yaml.safe_load is not enough
due to the nature of the NaN checker requiring an unsafe load[1].

closes #253
[1] 98adca6

This commit aims to change the jsonhandler to be able to read data from disk if it is a yaml file. Note: * Simply replacing the loads call with yaml.safe_load is not enough due to the nature of the NaN checker requiring an unsafe load[1]. closes cdent#253 [1] cdent@98adca6

cdent

Given how small this change is, I'm of two minds on how to proceed. Initially, I was thinking that it would make more sense as custom ContentHandler that replaced the existing JSONHandler: a subclass could override action. However, action ought to be rafactored to make that a bit easier with a method extracted for the data load.

I'm not sure. As usual, input appreciated from @FND and @jasonamyers. And you too @trevormccasland, of course.

cdent · 2018-07-01T15:24:38Z

docs/source/jsonpath.rst

@@ -76,11 +76,17 @@ or::
        $.pets: <@pets.json
        $.pets[0]: <@cat.json

+to use YAML files like the JSON ones above, they must be placed in a
+subdirectory::


may as well explain why

cdent · 2018-07-01T15:30:41Z

gabbi/handlers/jsonhandler.py

@@ -95,7 +96,10 @@ def action(self, test, path, value=None):
                rhs_path = test.replace_template('$' + rhs_path)
            info = test.load_data_file(value.replace('<@', '', 1))
            info = six.text_type(info, 'UTF-8')
-            value = self.loads(info)
+            if value.endswith('yaml'):
+                value = yaml.safe_load(info)


You say in the commit message that you can't use solely yaml.self_load() because of the NanChecker, but it is not clear how that is related to this.

Yeah good call, Looking at this again, I believe this has to do with a bug on how we test for isnan in tests/util.py we should cast the local var 'other' as a float because NaN is an IEEE 754 special value
(Pdb) other
'NaN'
(Pdb) math.isnan(other)
*** TypeError: a float is required

When making this change the test_unsafe_load tests were failing due to the NaNChecker not casting it's input argument as a float * (Pdb) other 'NaN' (Pdb) math.isnan(other) *** TypeError: a float is required

trevormccasland · 2018-07-02T16:10:28Z

Next patch will have the new ContentHandler subclassing JSONHandler and overriding the loads method and hopefully a better fix for the NaN issue. Do we want to fail if the input is a string? I believe the gnocchi tests are telling me yes, so maybe its not an issue afterall.

trevormccasland · 2018-07-03T16:40:33Z

So I think I see why the test case is failing now, it appears a float('NaN') dumped by json converts to a string when loaded in yaml.

myd = {'a': float('NaN')}
j = json.dumps(myd)
y = yaml.dump(myd)
y
'{a: .nan}\n'
j
'{"a": NaN}'
yaml.version
'3.12'
yaml.safe_load(j)
{'a': 'NaN'}
yaml.safe_load(y)
{'a': nan}

I'm still working on the YAMLHandler. I'm refactoring the ContentHandler base class to provide a load_data_file method that returns python data to get around this issue.

cdent · 2018-07-03T18:53:18Z

Yeah, it's pretty clear that despite the rumors, at least in Python, JSON is not a strict subset of YAML and you can't load it with complete accuracy. I think this makes it a pretty clear that using as the loader needs to be something that is opt-in, and the way we do that with ContentHandlers is by having others.

I agree that having an overridable load_<something> method of some kind is a good way to go.

Will look through the current patches to see what else is up. Thanks for plugging away at this.

cdent · 2018-07-03T18:57:53Z

gabbi/handlers/jsonhandler.py

@@ -54,7 +55,7 @@ def dumps(data, pretty=False, test=None):

    @staticmethod
    def loads(data):
-        return json.loads(data)
+        return yaml.safe_load(data)


The root of the speedbumps you've hit is because you're engaging yaml.safe_load here, instead of in the action method, meaning yaml gets involved in all the JSON processing, not just when you are loading data from disk to be handled in a response_json_paths.

This might be the right thing, but did you consider doing it in action and keeping the behaviors separated?

This patch aims to allow loading yaml files with the use of a custom handler named YAMLHandler. NOTE: * modifications to suitemaker.py and case.py were needed to delegate the handlers to their respective collections because in case.py the _assert_response method will iterate over all response_handlers which includes every handle (default, response_handlers, and content_handlers) because of how they were added when the tests were made. This caused the test_yaml_handler tests to fail because both JSONHandler and YAMLHandler were called on the test.

cdent

The two main issues now are

the name: YAMLHandler implies more than what the thing created does (see more within)
the adjustments to deal with multiples handlers that are of the same type are likely not in the right place, see the comments within

Thanks again for continuing to plug away at this. The overall concept of a load_data_file file is good. I'll try to find some time to look into how to manage the overlapping content handlers

cdent · 2018-07-04T14:12:27Z

docs/source/handlers.rst

@@ -22,6 +22,12 @@ string into a data structure in ``response_data`` that is used when
 evaluating ``response_json_paths`` entries in a test or doing
 JSONPath-based ``$RESPONSE[]`` substitutions.

+A YAML content handler has been added to extend the JSON handler.


We need to call it something different than the YAML content handler because what the name YAMLContentHandler means is that this is a handler for request and/or response bodies which are YAML. That's not what this is.

Naming is hard, so I don't have any clever ideas at the moment but will think on it.

I went with the ugly but explicit route :/ YAMLDiskLoadingJSONHandler . Alternatives could be YDLJSONHandler or YDLJHandler for short? Any other suggestions are welcome too.

cdent · 2018-07-04T14:17:35Z

gabbi/driver.py

@@ -105,7 +105,6 @@ def build_tests(path, loader, host=None, port=8001, intercept=None,
    for handler in (content_handlers + response_handlers +
                    handlers.RESPONSE_HANDLERS):
        handler_objects.append(handler())
-


cdent · 2018-07-04T14:29:55Z

gabbi/case.py

+        handlers = self.response_handlers
+        content_handler = self.get_content_handler(self.content_type)
+        if content_handler:
+            handlers.append(content_handler)


This isn't the right fix for this. Unfortunately, we're going to need to dig a bit to figure out what the right fix is. The problem will be happening somewhere in the _register handling in the base ResponseHandler.

Part of the reason this is wrong is that this method is supposed to be using response handlers only (of which content handlers are a subclass, but not all responsehandlers are content handlers) so if you have to go looking in get_content_handler than a handler has not _registered properly, or has been clobbered somehow.

I haven't got the cycles to dig into it today.

I added a class attribute to help with this.

cdent · 2018-07-04T14:30:19Z

gabbi/handlers/base.py

@@ -12,7 +12,6 @@
 # under the License.
 """Base classes for response and content handlers."""

-


unreleased (although this is a fine tidy up)

cdent · 2018-07-04T14:32:07Z

gabbi/suitemaker.py

        if handler.content_handler:
            content_handlers.append(handler.content_handler)
+        elif handler.response_handler:
+            response_handlers.append(handler.response_handler)


again this is probably not the right fix, as when a handler registers it can declare itself as both a content handler and a response handler and they are not mutually exclusive. For some things (notably the JSONHandler) it is both.

I added a class attribute so a handler can declare itself as just a content handler.

I noticed this change hasn't been reverted yet. Looks like there is still some more work to do on this part.

I've got a branch in progress that tries to address that in a slightly different way. We'll want to compare notes on that and decide which is best. I'll push it up shortly.

ok I'll wait on that then. I found there was a typo in jsonhandler.py too that this was covering up.
add_reponse_handling -> add_response_handling

This commit aims to add a class attribute called add_response_handling to the ResponseHandler. It's purpose is to allow a ContentHandler to exclude itself from the response_handler collection so that more than one ContentHandler with the same content-type can run tests correctly. YAMLHandler was renamed to something pretty long, I think we can use YDL as an acronym for YAMLDiskLoading if we don't like the length.

trevormccasland · 2018-07-06T16:43:16Z

docs/source/handlers.rst

@@ -149,4 +153,4 @@ passed the ``response_data`` of the prior test and the argument within the
 Please see the `JSONHandler source`_ for additional detail.

 .. _JSONHandler source: https://github.com/cdent/gabbi/blob/master/gabbi/handlers/jsonhandler.py
-.. _YAMLHandler tests: https://github.com/cdent/gabbi/blob/master/gabbi/tests/test_yaml_handler.py
+.. _YAMLDiskLoadingJSONHandler tests: https://github.com/cdent/gabbi/blob/master/gabbi/tests/test_yaml_disk_loading_jsonhandler.py


I don't see this being used anywhere :/

trevormccasland · 2018-07-06T16:43:55Z

docs/source/jsonpath.rst


    response_json_paths:
        $: @<subdir/values.yaml

-Examples like this can be found in one of gabbi's `own tests`_.
-
 When reading from disk you can apply the same JSONPath by adding a ':' to the
 end of your file name. This allows you to store multiple API responses into
 a JSON or YAML file to reduce file management when constructing your tests.


this may just have to read as '...multiple API responses into a JSON file...'

cdent · 2018-07-08T20:41:59Z

Writing just to ack that I've had a look at this, but will need to look more closely tomorrow to more fully understand the added class attribute.

Also, whatever is going on with the failing pytest-driven tests needs to be addressed.

Thanks for plugging away.

cdent · 2018-07-09T19:04:57Z

I'm going to try some experiments myself. I think one of the issues here is that the new subclassed handler should replace the json handler entirely, but I'm struggling to trace the logic in my brain so I need to hack on the code. I'll post back here when I've had some more concrete thoughts.

Tests with the same filename were giving weird connection errors when running the tests[1] [1] https://travis-ci.org/cdent/gabbi/jobs/400962655

cdent · 2018-07-09T20:51:09Z

That need for different named flags is almost certainly a bug in the pytest handling. Good find. I'll try to fix that.

This is an alternate version of pull #254, with the difference in gabbi/suitemaker.py where response_handlers is only extended if the response_* key has not been seen before. This means that if the YAMLDiskLoadingJSONHandler is turned on it does all the work of the JSONHandler for respone handling, because it is only present once. At the same time, the existing handing for content handlers means that custom handlers which accept a content-type are used first (and only) for the content type it accepts. The issues with pytest and files having the same name remain here. I tried to make some change to pytest test loading to make the global cache use more distinguishing keys, but it was distracting from progressing on the other work, so I'll deal with that separately.

this patch aims to integrate the changes made in[1] revert changes related to the class attribute 'add_response_handler' in[2] and add unit tests to cover the suitemaker logic from[1]. [1]cdent#256 [2]cdent@241684e

cdent · 2018-07-16T09:19:52Z

gabbi/suitemaker.py

    for handler in handlers:
        default_test_dict.update(handler.test_base)
+        if handler.response_handler:
+            key = 'response_%s' % handler.test_key_suffix


since key is getting thrown away outside this look, constructing 'key' from 'response_%s' is probably not necessary (as it will be done for every suffix). Should be okay to use handler.test_key_suffix directly in the set.

cdent · 2018-07-16T09:21:17Z

This is looking good. I added one more comment on a small thing that could be fixed to make it a bit more futureproof.

Thanks!

uses handler's test_key_suffix property rather than trying to recreate its private _key member's value.

cdent · 2018-07-16T20:00:11Z

This is released as 1.44.0

@trevormccasland If you get the opportunity to write a blog post or something like that about how you are using gabbi, especially the features you added, that would be great. Since people rarely read the docs all the way through, only something that does a bit of "we wanted to do this, so we added these features" kind of explanation will really get the idea across.

Thanks again.

trevormccasland · 2018-07-17T13:32:03Z

Thanks a lot for the reviews and working through the pull request with me. I will work on a blog post as soon as we get everything together on our side. Then I'll post the link here when it's ready.

cdent requested changes Jul 1, 2018

View reviewed changes

Use yaml.safe_load in jsonhandler's load call

c8fc6b3

When making this change the test_unsafe_load tests were failing due to the NaNChecker not casting it's input argument as a float * (Pdb) other 'NaN' (Pdb) math.isnan(other) *** TypeError: a float is required

cdent reviewed Jul 3, 2018

View reviewed changes

cdent requested changes Jul 4, 2018

View reviewed changes

trevormccasland commented Jul 6, 2018

View reviewed changes

pytest fix: rename test files

7f79faf

Tests with the same filename were giving weird connection errors when running the tests[1] [1] https://travis-ci.org/cdent/gabbi/jobs/400962655

cdent mentioned this pull request Jul 9, 2018

WIP: alternate mode for repeating response handlers #256

Closed

integrate repeating response handlers

03476e6

this patch aims to integrate the changes made in[1] revert changes related to the class attribute 'add_response_handler' in[2] and add unit tests to cover the suitemaker logic from[1]. [1]cdent#256 [2]cdent@241684e

cdent reviewed Jul 16, 2018

View reviewed changes

use test_key_suffix instead of making key

cc9d975

uses handler's test_key_suffix property rather than trying to recreate its private _key member's value.

cdent merged commit bc20dac into cdent:master Jul 16, 2018

trevormccasland deleted the yaml-from-disk branch July 16, 2018 21:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

jsonhandler: allow reading yaml data from disk #254

jsonhandler: allow reading yaml data from disk #254

trevormccasland commented Jun 29, 2018

cdent left a comment

cdent Jul 1, 2018

trevormccasland Jul 2, 2018

cdent Jul 1, 2018

trevormccasland Jul 2, 2018

trevormccasland commented Jul 2, 2018

trevormccasland commented Jul 3, 2018

cdent commented Jul 3, 2018

cdent Jul 3, 2018

cdent left a comment

cdent Jul 4, 2018

trevormccasland Jul 6, 2018

cdent Jul 4, 2018

cdent Jul 4, 2018

trevormccasland Jul 6, 2018

cdent Jul 4, 2018

cdent Jul 4, 2018

trevormccasland Jul 6, 2018

trevormccasland Jul 9, 2018

cdent Jul 9, 2018

trevormccasland Jul 9, 2018 •

edited

Loading

trevormccasland Jul 6, 2018

trevormccasland Jul 6, 2018

cdent commented Jul 8, 2018

cdent commented Jul 9, 2018

cdent commented Jul 9, 2018

cdent Jul 16, 2018

cdent commented Jul 16, 2018 •

edited

Loading

cdent commented Jul 16, 2018 •

edited

Loading

trevormccasland commented Jul 17, 2018

		@@ -12,7 +12,6 @@
		# under the License.
		"""Base classes for response and content handlers."""

jsonhandler: allow reading yaml data from disk #254

jsonhandler: allow reading yaml data from disk #254

Conversation

trevormccasland commented Jun 29, 2018

cdent left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

trevormccasland commented Jul 2, 2018

trevormccasland commented Jul 3, 2018

cdent commented Jul 3, 2018

Choose a reason for hiding this comment

cdent left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

trevormccasland Jul 9, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdent commented Jul 8, 2018

cdent commented Jul 9, 2018

cdent commented Jul 9, 2018

Choose a reason for hiding this comment

cdent commented Jul 16, 2018 • edited Loading

cdent commented Jul 16, 2018 • edited Loading

trevormccasland commented Jul 17, 2018

trevormccasland Jul 9, 2018 •

edited

Loading

cdent commented Jul 16, 2018 •

edited

Loading

cdent commented Jul 16, 2018 •

edited

Loading