Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Json schema #526

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Json schema #526

wants to merge 6 commits into from

Conversation

carlhiggs
Copy link
Collaborator

This pull request addresses #506, introducing

  1. a JSON Schema definition (region-json-schema.json) against which we can validate our example configuration file (and later, user configuration files) based on the 2020-12 draft specification
  2. a test that the example configuration matches the region json schema

Based on my local interactive tests our current example configuration file appears to validate against the region JSON schema, however the inclusion of the test is to ensure that this remains the case in our continuous integration workflow.

There is much more to do, but this will ultimately involve breaking changes to region configuration files. For example, consistent formatting of dates, and metadata for data objects.

But for now, we now have a draft JSON schema configuration file definition that validates, and this is something that we can build from. This addresses #506 in a most basic way; however, let's leave that issue open until we introduce checking of user configurations.

Checking of user configurations will require more rigorous thinking through of what are required vs optional fields. Further, changes to configuration formatting requirements will cascade to code changes (hopefully, simplifications) to accommodate these. So, this is all part of a bigger task.

…ed schema against the reference standard, and the example configuration against the updated version successfully using jsonschema) towards #506
…g schema, towards #506; example configuration now validates (although, it is pending improvement and standardisation, eg of dates --- but that will involve breaking changes, so is a separate bigger task
@carlhiggs carlhiggs requested a review from shiqin-liu March 18, 2025 06:04
…ions (i.e. that null is correct value, not 0); towards #506
@carlhiggs carlhiggs removed the request for review from shiqin-liu March 18, 2025 06:27
@carlhiggs
Copy link
Collaborator Author

there are still some issues... that's why there are tests...

…e yaml constructuor modificatino for parsing dates and keys as strings; hopefully test passes in workflow now (does locally)
@carlhiggs
Copy link
Collaborator Author

carlhiggs commented Mar 18, 2025

So, the test for valid yaml schema appears to check out fine

test_0_2_schema_yaml (tests.tests.tests.test_0_2_schema_yaml)
Check if example configuration file is valid against jsonschema file. ... ok

----- but somehow unexpectedly, I think that the work around to ensure that yaml keys are treated as strings not integers although intended to only apply within the scope of the test function must be working more globally, as an error related to this occurs now in a subsequent test ---- expecting a string, but getting an int:

A couple of failing tests on the image list
ERROR: test_6_example_generate (tests.tests.tests.test_6_example_generate)

Generate resources for example region.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ghsci/process/tests/tests.py", line 132, in test_6_example_generate
    r.generate()
  File "/home/ghsci/process/subprocesses/ghsci.py", line 1028, in generate
    generate_resources(self)
  File "/home/ghsci/process/generate.py", line 114, in generate
    r.generate_report(language=language, report='indicators')
  File "/home/ghsci/process/subprocesses/ghsci.py", line 1080, in generate_report
    generate_report_for_language(
  File "/home/ghsci/process/subprocesses/_utils.py", line 211, in generate_report_for_language
    phrases = r.get_phrases(language)
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ghsci/process/subprocesses/ghsci.py", line 1842, in get_phrases
    phrases[f'Image {i} file'] = city_details['images'][i]['file']
                                 ~~~~~~~~~~~~~~~~~~~~~~^^^
KeyError: 5

======================================================================
ERROR: test_8_example_generate_report_in_another_language (tests.tests.tests.test_8_example_generate_report_in_another_language)
Generate resources for example region.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ghsci/process/tests/tests.py", line 174, in test_8_example_generate_report_in_another_language
    r.generate_report('Spanish - Latin America')
  File "/home/ghsci/process/subprocesses/ghsci.py", line 1080, in generate_report
    generate_report_for_language(
  File "/home/ghsci/process/subprocesses/_utils.py", line 211, in generate_report_for_language
    phrases = r.get_phrases(language)
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ghsci/process/subprocesses/ghsci.py", line 1842, in get_phrases
    phrases[f'Image {i} file'] = city_details['images'][i]['file']
                                 ~~~~~~~~~~~~~~~~~~~~~~^^^
KeyError: 5

Not sure how to resolve this right now -- but that's the issue, can address tomorrow. Must be some simple way to specify the parsing of keys only as strings only within the scope of the schema validation test.

…Test workflow appears to pass now, without unintended downstream consequences of modifying pyyaml constructors.
@carlhiggs carlhiggs requested a review from shiqin-liu March 18, 2025 08:39
@shiqin-liu
Copy link
Collaborator

Thanks for the work Carl! As this looks like a work-in-progress to me, let's keep it i the development branch for now, rather than merging into the main. This work deserve more thinkings and discussion, circle back to the goal of replacing with Json schema, I see there are opportunities to break down based on different functional inputs, and possibly to separate the input data parameters with meta data information etc.

@carlhiggs
Copy link
Collaborator Author

carlhiggs commented Mar 21, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants