Standards landing pages generated by build script #29
The Python script build.py uses metadata from the CSV tables in the rs.tdwg.org repo to create standards landing pages that conform to the Standards Documentation Specification (SDS). To run it, one downloads the "stds-pages" branch of the repo onto a local drive, then runs the script. The script generates folders and Markdown files that mirror the structure of the standards directory of the website repo. The resulting index.md files generated by the script have been pushed to GitHub here so that they can be viewed from their respective standards directories in rendered form rather than as raw Markdown.
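The folder-mirroring step described above can be sketched roughly as follows. This is an illustration only, not the actual build.py: the column names `slug` and `label`, the sample rows, and the function name `write_landing_pages` are assumptions made for the example.

```python
import csv
import io
import tempfile
from pathlib import Path

# Hypothetical sample standing in for standards.csv; the real column names may differ.
SAMPLE_STANDARDS_CSV = """slug,label
example-a,Example Standard A
example-b,"Example Standard B, with a comma"
"""


def write_landing_pages(csv_text, out_root):
    """Create <out_root>/<slug>/index.md for each row and return the paths written."""
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        page_dir = Path(out_root) / row["slug"]
        # One folder per standard, mirroring a standards/ directory layout.
        page_dir.mkdir(parents=True, exist_ok=True)
        page = page_dir / "index.md"
        page.write_text(f"# {row['label']}\n", encoding="utf-8")
        written.append(page)
    return written


if __name__ == "__main__":
    # Write into a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_landing_pages(SAMPLE_STANDARDS_CSV, tmp):
            print(path.relative_to(tmp))
```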
There are two reasons for generating the pages by script rather than by hand. One is simply that it would be a lot of extra work to manually create all of the pages for the old standards, given that nearly all of the information about the standards is already present in the standards.csv, docs.csv, and docs-authors.csv files. The other, more compelling reason is that the SDS implies that all representations of an abstract resource (such as a standard or document) should contain substantively the same metadata about that resource. The part of the standards landing page that is strictly controlled by the SDS (the header section) should provide exactly the same information as is included in the machine-readable serializations. The way to ensure that is to generate the header section from the same information (the tables in the rs.tdwg.org repo) that is used to generate the machine-readable metadata.
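To illustrate the single-source idea, here is a minimal sketch in which one record feeds both a human-readable header and a machine-readable serialization. The record, its field names, and both function names are hypothetical; the real script and the real serializations will look different.

```python
import json

# Hypothetical record standing in for one row of the metadata tables.
RECORD = {
    "label": "Example Standard",
    "status": "example-status",
    "ratified": "2000-01-01",
}


def markdown_header(rec):
    """Render every field of the record as one line of a Markdown header."""
    return "\n".join(f"**{key}:** {value}" for key, value in rec.items())


def json_metadata(rec):
    """Serialize the same record, unchanged, as JSON."""
    return json.dumps(rec, indent=2)


if __name__ == "__main__":
    print(markdown_header(RECORD))
    print(json_metadata(RECORD))
```

Because both outputs are derived from the same record, a change to the table propagates to both, and they cannot drift apart the way hand-edited pages can.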
There are several key requirements of Section 3.1 of the SDS (regarding landing pages for standards) that this script satisfies but that many of the existing landing pages currently do not:
The last item is the major feature that the script enables. It is critical in two ways: it makes clear which documents are part of a standard (and, by exclusion, which are not), and it disambiguates standards from the documents that compose them. For some modern standards that include a single document, the distinction may not seem important, but for many of the older standards, TDWG is not the publisher of the document, even though by the act of ratification it is the publisher of the standard. It is also important to distinguish between a standard and its documents because some standards contain many documents that may have different contributors, who should be acknowledged independently, as well as different publication dates (which may differ from the ratification date of the standard itself). Taxonomic Literature, Edition 2 and its Supplements is a notable example of this.
Note: using this build script doesn't require that all of the generated pages actually be used on the TDWG website. For landing pages of standards that are actively managed (like DwC or AC, for example), it would probably be better to use manually built Markdown. One could compare with the script-generated page to make sure that the header section is consistent, but create the rest of the page (e.g. links to documents included in the standard rather than full descriptions) manually. But for many of the older standards where it would be a pain to build the page manually (tl-2 is a notable example), the script-generated pages would be very useful. I did make an attempt to include manually generated content as part of the page - I simply copied the extra Markdown from the existing landing pages and pasted it into the "other" column of the pageInfo.csv file, newlines and all. So most of the manually edited content will still appear in the automatically generated index.md files (although this may not be the best way to manage this content).
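For reviewers wondering how Markdown "newlines and all" survives inside a CSV cell: Python's standard csv module keeps line breaks that sit inside a quoted field. The sketch below shows that behaviour on made-up data. Only the "other" column name comes from pageInfo.csv as described above; the `standard` column, the sample row, and the `append_other` helper are assumptions for illustration.

```python
import csv
import io

# Made-up stand-in for pageInfo.csv; the quoted "other" cell spans several lines.
SAMPLE_PAGE_INFO = (
    "standard,other\n"
    'example,"## Notes\n\nFirst line.\nSecond line."\n'
)


def read_page_info(csv_text):
    """Parse the CSV text; embedded newlines in quoted fields are preserved."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def append_other(header, other):
    """Join a generated header and the hand-written Markdown with a blank line."""
    return header.rstrip("\n") + "\n\n" + other + "\n"


if __name__ == "__main__":
    row = read_page_info(SAMPLE_PAGE_INFO)[0]
    print(append_other("# Example", row["other"]))
```

When reading from a real file rather than a string, the csv documentation recommends opening it with `newline=""` so that embedded line breaks are passed through untouched.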
I have requested reviews by @peterdesmet (as Jekyll/website guru), @stanblum (as infrastructure czar), and @chicoreus (as fearless TAG leader). I've assigned the various issues below to the three of you based on my guess about which of your experience would be most relevant. The issues below have all been assigned to the "deploy standards landing pages generated by build script" milestone.
The following items need to be resolved before this can be fully implemented:
It would be desirable, but not required, to resolve the following issues before implementation:
This issue should be addressed after implementation: