Commit 04f213b

ovflowd authored and danielleadams committed
tools: add documentation regarding our api tooling
Introduces a proper imperative description of how the current API documentation build system works.

Refs: nodejs/next-10#169
PR-URL: #45270
Reviewed-By: Michael Dawson <midawson@redhat.com>
1 parent c63d825 commit 04f213b

1 file changed

+296
-0
lines changed

doc/contributing/api-documentation.md

+296
@@ -0,0 +1,296 @@
1+
# Node.js API Documentation Tooling
2+
3+
The Node.js API documentation is generated by in-house tooling that resides
4+
within the [tools/doc](https://github.com/nodejs/node/tree/main/tools/doc)
5+
directory.
6+
7+
The build process (using `make doc`) uses this tooling to parse the markdown
8+
files in [doc/api](https://github.com/nodejs/node/tree/main/doc/api) and
9+
generate the following:
10+
11+
1. Human-readable HTML in `out/doc/api/*.html`
12+
2. A JSON representation in `out/doc/api/*.json`
13+
14+
These are published to nodejs.org for multiple versions of Node.js. As an
15+
example, the latest version of the human-readable HTML is published to
16+
[nodejs.org/en/docs](https://nodejs.org/en/docs/), and the latest version
17+
of the JSON documentation is published to
18+
[nodejs.org/api/all.json](https://nodejs.org/api/all.json).
19+
20+
<!-- TODO: Add docs about how the publishing process happens -->
21+
22+
**The key things to know about the tooling include:**
23+
24+
1. The entry-point is `tools/doc/generate.mjs`.
25+
2. The tooling supports the CLI arguments listed in the table below.
26+
3. The tooling processes one file at a time.
27+
4. The tooling uses a set of dependencies as described in the dependencies
28+
section.
29+
5. The tooling parses the input files and does several transformations to the
30+
AST (Abstract Syntax Tree).
31+
6. The tooling generates a JSON output that contains the metadata and content of
32+
the Markdown file.
33+
7. The tooling generates an HTML output that contains a human-readable,
34+
ready-to-view version of the file.
35+
36+
This documentation serves the purpose of explaining the existing tooling
37+
processes, to allow easier maintenance and evolution of the tooling. It is not
38+
meant to be a guide on how to write documentation for Node.js.
39+
40+
#### Vocabulary & Good-to-Knows
41+
42+
* AST means "Abstract Syntax Tree" and it is a data structure that represents
43+
the structure of a certain data format. In our case, the AST is a "graph"
44+
representation of the contents of the Markdown file.
45+
* MDN means [Mozilla Developer Network](https://developer.mozilla.org/en-US/)
46+
and it is a website that contains documentation for web technologies. We use
47+
it as a reference for the structure of the documentation.
48+
* The
49+
[Stability Index](https://nodejs.org/dist/latest/docs/api/documentation.html#stability-index)
50+
is used to communicate the stability of a given Node.js module. The Stability
51+
levels include:
52+
* Stability 0: Deprecated. (This module is Deprecated)
53+
* Stability 1: Experimental. (This module is Experimental)
54+
* Stability 2: Stable. (This module is Stable)
55+
* Stability 3: Legacy. (This module is Legacy)
56+
* Within Remark, YAML snippets (`<!-- something -->`) are considered HTML nodes
57+
because YAML isn't valid Markdown content (it doesn't abide by the
58+
Markdown spec).
59+
* "New Tooling" refers to the (written from scratch) API build tooling
60+
introduced in `nodejs/nodejs.dev` that might replace the current one from
61+
`nodejs/node`.
62+
63+
## CLI Arguments
64+
65+
The tooling requires a `filename` argument and supports extra arguments (some
66+
also required) as shown below:
67+
68+
| Argument | Description | Required | Example |
69+
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | -------- | ---------------------------------- |
70+
| `--node-version=` | The version of Node.js that is being documented. It defaults to `process.version` which is supplied by Node.js itself | No | v19.0.0 |
71+
| `--output-directory=` | The directory where the output files will be generated. | Yes | `./out/api/` |
72+
| `--apilinks=` | This file is used as an index to specify the source file for each module | No | `./out/doc/api/apilinks.json` |
73+
| `--versions-file=` | This file is used to specify an index of all previous versions of Node.js. It is used for the Version Navigation on the API docs page. | No | `./out/previous-doc-versions.json` |
74+
75+
**Note:** The files passed to both the `apilinks` and `versions-file` arguments are generated by
76+
the Node.js build process (Makefile), and each contains a JSON
77+
object.
78+
79+
### Basic Usage
80+
81+
```bash
82+
# cd tools/doc
83+
npm run node-doc-generator ${filename}
84+
```
85+
86+
**OR**
87+
88+
```bash
89+
# nodejs/node root directory
90+
make doc
91+
```
92+
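To make the argument table above more concrete, here is a minimal sketch of how such arguments could be collected. The function name `parseDocArgs` and the shape of the returned object are assumptions made for illustration; this is not the code of the actual generator script.

```javascript
// Hypothetical sketch: collects the arguments listed in the table above.
// It is not the implementation used by the real tooling.
function parseDocArgs(argv) {
  const options = {
    filename: null,
    nodeVersion: process.version, // documented default
    outputDirectory: null,
    apilinks: null,
    versionsFile: null,
  };

  for (const arg of argv) {
    if (!arg.startsWith('--')) {
      options.filename = arg; // the positional argument is the input file
    } else if (arg.startsWith('--node-version=')) {
      options.nodeVersion = arg.slice('--node-version='.length);
    } else if (arg.startsWith('--output-directory=')) {
      options.outputDirectory = arg.slice('--output-directory='.length);
    } else if (arg.startsWith('--apilinks=')) {
      options.apilinks = arg.slice('--apilinks='.length);
    } else if (arg.startsWith('--versions-file=')) {
      options.versionsFile = arg.slice('--versions-file='.length);
    }
  }

  // Both of these are marked as required in the table.
  if (!options.filename) throw new Error('No input file specified');
  if (!options.outputDirectory) throw new Error('No output directory specified');
  return options;
}

const parsedArgs = parseDocArgs([
  '--node-version=v19.0.0',
  '--output-directory=./out/api/',
  'doc/api/fs.md',
]);
console.log(parsedArgs);
```

Optional arguments that are not passed stay `null` in this sketch, so a caller can tell "not provided" apart from an empty value.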
93+
## Dependencies and how the Tooling works internally
94+
95+
The API tooling uses a syntax-tree (AST) processing library called
96+
[unified](https://github.com/unifiedjs/unified) to process the input file as
97+
a graph whose nodes are easy to modify and update.
98+
99+
In addition to `unified` we also use
100+
[Remark](https://github.com/remarkjs/remark) for manipulating the Markdown part,
101+
and [Rehype](https://github.com/rehypejs/rehype) to help convert to and from
102+
Markdown.
103+
104+
### What are the steps of the internal tooling?
105+
106+
The tooling uses the pipe-like `unified` engine to chain each part of the process.
107+
(The description below is a simplified version)
108+
109+
* Starting from reading the Frontmatter section of the Markdown file with
110+
[remark-frontmatter](https://www.npmjs.com/package/remark-frontmatter).
111+
* Then the tooling parses the Markdown using `remark-parse` and adds
112+
support for [GitHub Flavoured Markdown](https://github.github.com/gfm/).
113+
* The tooling proceeds by parsing some of the Markdown nodes and transforming
114+
them to HTML.
115+
* The tooling proceeds to generate the JSON output of the file.
116+
* Finally it does its final node transformations and generates a stringified
117+
HTML.
118+
* It then stores the output to a JSON file and adds extra styling to the HTML
119+
and then stores the HTML file.
120+
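The pipe-like flow described above can be pictured with a small, dependency-free analogue. The names `createPipeline`, `use` and `run`, as well as the three toy steps, are assumptions for illustration only; they are not the `unified` API and they skip frontmatter, GFM and JSON generation entirely.

```javascript
// Dependency-free analogue of a pipe-like processor.
// All names here are hypothetical and are not the unified API.
function createPipeline() {
  const steps = [];
  return {
    use(step) {
      steps.push(step);
      return this; // allow chaining
    },
    run(input) {
      // Feed the output of each step into the next one.
      return steps.reduce((value, step) => step(value), input);
    },
  };
}

// Toy steps mirroring the order described above: parse, transform, stringify.
const parseStep = (markdown) => ({
  type: 'root',
  children: markdown
    .split('\n')
    .filter(Boolean)
    .map((line) => ({ type: 'text', value: line })),
});
const toHtmlStep = (tree) => ({
  ...tree,
  children: tree.children.map((node) => ({
    type: 'html',
    value: `<p>${node.value}</p>`,
  })),
});
const stringifyStep = (tree) =>
  tree.children.map((node) => node.value).join('\n');

const pipelineOutput = createPipeline()
  .use(parseStep)
  .use(toHtmlStep)
  .use(stringifyStep)
  .run('Hello\n\nWorld');
console.log(pipelineOutput);
```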
121+
### What is each file responsible for?
122+
123+
The files listed below are the ones referenced and actually used during the
124+
build process of the API docs as we see on <https://nodejs.org/api>. The
125+
remaining files from the directory might be used by other steps of the Node.js
126+
Makefile or might even be deprecated/remnant of old processes and might need to
127+
be revisited/removed.
128+
129+
* **`html.mjs`**: Responsible for transforming nodes by decorating them with
130+
visual artifacts for the HTML pages;
131+
* For example, transforming man or JS doc references to links correctly
132+
referring to respective External documentation.
133+
* **`json.mjs`**: Responsible for generating the JSON output of the file;
134+
* It is mostly responsible for going through the whole Markdown file and
135+
generating a JSON object that represents the Metadata of a specific Module.
136+
* For example, for the FS module, it will generate an object with all its
137+
methods, events, and classes, using several regular expressions (RegEx) for
138+
extracting the information needed.
139+
* **`generate.mjs`**: Main entry-point of doc generation for a specific file. It
140+
does e2e processing of a documentation file;
141+
* **`allhtml.mjs`**: A script executed after all files are generated to create a
142+
single "all" page containing all the HTML documentation;
143+
* **`alljson.mjs`**: A script executed after all files are generated to create a
144+
single "all" page containing all the JSON entries;
145+
* **`markdown.mjs`**: Contains utility to replace Markdown links to work with
146+
the <https://nodejs.org/api/> website.
147+
* **`common.mjs`**: Contains a few utility functions that are used by the other
148+
files.
149+
* **`type-parser.mjs`**: Used to replace "type references" (e.g. "String", or
150+
"Buffer") to the correct Internal/External documentation pages (i.e. MDN or
151+
other Node.js documentation pages).
152+
153+
**Note:** It is important to mention that other files not mentioned here might
154+
be used during the process but are not relevant to the generation of the API
155+
docs themselves. You will notice that a lot of the logic within the build
156+
process is **specific** to the current <https://nodejs.org/api/> infrastructure.
157+
Examples include adding JavaScript snippets and styles, transforming certain Markdown
158+
elements into HTML, and adding certain HTML classes.
159+
160+
**Note:** Regarding the previous **Note** it is important to mention that we're
161+
currently working on an API tooling that is generic and independent of the
162+
current Nodejs.org Infrastructure.
163+
[The new tooling that is functional is available at the nodejs.dev repository](https://github.com/nodejs/nodejs.dev/blob/main/scripts/syncApiDocs.js)
164+
and uses plain RegEx (no AST) and [MDX](https://mdxjs.com/).
165+
166+
## The Build Process
167+
168+
The build process that happens on `generate.mjs` follows the steps below:
169+
170+
* Links within the Markdown are replaced directly within the source Markdown
171+
(AST) (`markdown.replaceLinks`)
172+
* This happens within `markdown.mjs` and basically it adds suffixes or
173+
modifies link references within the Markdown
174+
* This is necessary for the `https://nodejs.org` infrastructure as all pages
175+
are suffixed with `.html`
176+
* Text (and some YAML) Nodes are transformed/modified through
177+
`html.preprocessText`
178+
* JSON output is generated through `json.jsonAPI`
179+
* The title of the page is inferred through `html.firstHeader`
180+
* Nodes are transformed into HTML Elements through `html.preprocessElements`
181+
* The HTML Table of Contents (ToC) is generated through `html.buildToc`
182+
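As an illustration of the first step, the sketch below rewrites relative `.md` link targets so that they point to `.html` pages. The helper name `rewriteMarkdownLinks` and the inline-link regular expression are assumptions; the real `markdown.replaceLinks` may work differently (for example, on link reference definitions).

```javascript
// Hypothetical helper: rewrites relative `.md` link targets to `.html`.
// It only handles inline links and is not the real replaceLinks logic.
function rewriteMarkdownLinks(markdown) {
  // Matches "](name.md)" and "](name.md#anchor)" but skips absolute URLs.
  return markdown.replace(
    /\]\((?!https?:\/\/)([^)#\s]+)\.md(#[^)\s]*)?\)/g,
    (match, page, anchor = '') => `](${page}.html${anchor})`,
  );
}

const rewritten = rewriteMarkdownLinks(
  'See [fs](fs.md#fsreadfilepath) and [docs](https://example.com/a.md).',
);
console.log(rewritten);
```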
183+
### `html.mjs`
184+
185+
This file is responsible for AST node transformations that either decorate
186+
Markdown nodes with more data or transform them into HTML nodes
187+
that serve a visual purpose, for example, to generate the "Added
188+
at" label, the Source Links, the Stability Index, or the History table.
189+
190+
**Note:** Methods not listed below are either not relevant or utility methods
191+
for string/array/object manipulation (i.e., they are used by the other methods
192+
mentioned below).
193+
194+
#### `preprocessText`
195+
196+
**New Tooling:** Most of the features within this method are available within
197+
the new tooling.
198+
199+
This method does two things:
200+
201+
* Replaces the Source Link YAML entry `<!-- source_link= -->` with a "Source
202+
Link" HTML anchor element.
203+
* Replaces type references within the Markdown (text) (e.g., "String", "Buffer")
204+
with the correct HTML anchor element that links to the correct documentation
205+
page.
206+
* The original node then gets mutated from text to HTML.
207+
* It also updates references to Linux "MAN" pages to Web versions of them.
208+
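The type-reference replacement can be sketched as follows. This assumes that type references are written in curly braces (for example `{Buffer}`) and uses a tiny, hypothetical lookup table; the node shape, the `typeLinks` map and the generated markup are illustrative and not taken from the tooling.

```javascript
// Hypothetical lookup table; the real mapping lives in the tooling.
const typeLinks = new Map([
  ['String', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String'],
  ['Buffer', 'buffer.html#class-buffer'],
]);

// Replaces known `{Type}` references in a text node with anchors.
// When something was replaced, the node is returned as an HTML node.
function linkTypeReferences(textNode) {
  let changed = false;
  const value = textNode.value.replace(/\{([A-Za-z]+)\}/g, (match, name) => {
    const url = typeLinks.get(name);
    if (!url) return match; // unknown types are left untouched
    changed = true;
    return `<a href="${url}" class="type">&lt;${name}&gt;</a>`;
  });
  return changed ? { type: 'html', value } : textNode;
}

const linkedNode = linkTypeReferences({
  type: 'text',
  value: 'Returns {Buffer} data',
});
console.log(linkedNode);
```

Returning a new node of type `html` mirrors the point above that the original node gets mutated from text to HTML.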
209+
#### `firstHeader`
210+
211+
**New Tooling:** All features within this method are available within the new
212+
Tooling.
213+
214+
This method attempts to extract the first heading of the page (recursively) to
215+
define the "title" of the page.
216+
217+
**Note:** As all API Markdown files start with a Heading, this could possibly be
218+
simplified.
219+
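A recursive search of this kind might look like the sketch below. The function name `findFirstHeading` and the simplified node shape (`type`, `children`, `text`) are assumptions for illustration.

```javascript
// Hypothetical sketch: depth-first search for the first heading node.
function findFirstHeading(node) {
  if (node.type === 'heading') return node;
  for (const child of node.children ?? []) {
    const found = findFirstHeading(child);
    if (found) return found;
  }
  return null; // no heading anywhere in this subtree
}

const sampleTree = {
  type: 'root',
  children: [
    { type: 'html', value: '<!-- introduced_in=v0.10.0 -->' },
    { type: 'heading', depth: 1, text: 'File system' },
    { type: 'heading', depth: 2, text: 'Promises API' },
  ],
};
const firstHeading = findFirstHeading(sampleTree);
console.log(firstHeading.text);
```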
220+
#### `preprocessElements`
221+
222+
**New Tooling:** All features within this method are available within the new
223+
tooling.
224+
225+
This method is responsible for multiple transformations of the AST
226+
nodes, mostly transforming source nodes into their respective HTML elements
227+
with diverse responsibilities, such as:
228+
229+
* Updating Markdown `code` blocks by adding Language highlighting
230+
* It also adds the "CJS"/"MJS" switch to Nodes that are followed by their
231+
CJS/ESM equivalents.
232+
* Increasing the Heading level of each Heading
233+
* Parsing YAML blocks and transforming them into HTML elements (see the
234+
`parseYAML` method)
235+
* Turning BlockQuotes that are prefixed by the word "Stability" into a Stability
236+
Index HTML element.
237+
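Two of the transformations above, increasing the heading level and turning "Stability" BlockQuotes into an HTML element, can be sketched like this. The flattened node shape (a `text` field on the blockquote) and the CSS class names are assumptions chosen for illustration.

```javascript
// Hypothetical sketch of two element transformations.
// Real AST nodes are nested; a flat `text` field keeps the example short.
function transformElement(node) {
  if (node.type === 'heading') {
    // Increase the heading level by one.
    return { ...node, depth: node.depth + 1 };
  }
  if (node.type === 'blockquote') {
    const match = /^Stability: (\d)/.exec(node.text);
    if (match) {
      // Illustrative markup; the class names are assumptions.
      return {
        type: 'html',
        value: `<div class="api_stability api_stability_${match[1]}">${node.text}</div>`,
      };
    }
  }
  return node; // everything else is left untouched
}

const transformedHeading = transformElement({ type: 'heading', depth: 2, text: 'fs.readFile' });
const transformedQuote = transformElement({ type: 'blockquote', text: 'Stability: 2 - Stable' });
console.log(transformedHeading.depth, transformedQuote.value);
```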
238+
#### `parseYAML`
239+
240+
**New Tooling:** Most of the features within this method are available within
241+
the new tooling.
242+
243+
This method is responsible for parsing the `<!-- YAML snippets -->` and
244+
transforming them into HTML elements.
245+
246+
It follows a "schema" of sorts that consists of the following
247+
options:
248+
249+
| YAML Key | Description | Example | Example Result | Available on new tooling |
250+
| ------------- | ------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | --------------------------- | ------------------------ |
251+
| `added` | It's used to reference when a certain "module", "class" or "method" was added on Node.js | `added: v0.1.90` | `Added in: v0.1.90` | Yes |
252+
| `deprecated` | It's used to reference when a certain "module", "class" or "method" was deprecated on Node.js | `deprecated: v0.1.90` | `Deprecated since: v0.1.90` | Yes |
253+
| `removed` | It's used to reference when a certain "module", "class" or "method" was removed on Node.js | `removed: v0.1.90` | `Removed in: v0.1.90` | No |
254+
| `changes` | It's used to describe all the changes (historical ones) that happened within a certain "module", "class" or "method" in Node.js | `[{ version: v0.1.90, pr-url: '', description: '' }]` | -- | Yes |
255+
| `napiVersion` | It's used to describe in which version of the N-API this "module", "class" or "method" is available within Node.js | `napiVersion: 1` | `N-API version: 1` | Yes |
256+
257+
**Note:** The `changes` field gets prepended with the `added`, `deprecated` and
258+
`removed` fields if they exist. The table only gets generated if a `changes`
259+
field exists. In the new tooling only "added" is prepended for now.
260+
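The mapping from YAML keys to the labels in the "Example Result" column can be sketched as below. To stay dependency-free, this assumes simple `key: value` lines instead of using a YAML parser, and it ignores the `changes` key; the function name `describeYamlComment` is hypothetical.

```javascript
// Labels taken from the "Example Result" column of the table above.
const yamlLabels = new Map([
  ['added', 'Added in'],
  ['deprecated', 'Deprecated since'],
  ['removed', 'Removed in'],
  ['napiVersion', 'N-API version'],
]);

// Hypothetical sketch: handles only simple `key: value` lines.
// A real implementation would use a proper YAML parser.
function describeYamlComment(comment) {
  const body = comment.replace(/^<!--\s*YAML/, '').replace(/-->\s*$/, '');
  const result = [];
  for (const line of body.split('\n')) {
    const match = /^\s*([A-Za-z]+):\s*(\S.*)$/.exec(line);
    if (match && yamlLabels.has(match[1])) {
      result.push(`${yamlLabels.get(match[1])}: ${match[2].trim()}`);
    }
  }
  return result;
}

const yamlDescriptions = describeYamlComment(
  '<!-- YAML\nadded: v0.1.90\ndeprecated: v0.1.90\nnapiVersion: 1\n-->',
);
console.log(yamlDescriptions);
```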
261+
#### `buildToc`
262+
263+
**New Tooling:** This feature is natively available within the new tooling
264+
through MDX.
265+
266+
This method generates the Table of Contents based on all the Headings of the
267+
Markdown file.
268+
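A simplified way to build a Table of Contents from headings is sketched below. The function name `buildTocSketch`, the flat `{ depth, text }` input and the slug rule for anchors are assumptions; the real `html.buildToc` produces HTML and may derive anchors differently.

```javascript
// Hypothetical sketch: builds a nested Markdown list from headings.
function buildTocSketch(headings) {
  return headings
    .map(({ depth, text }) => {
      const indent = '  '.repeat(depth - 1);
      // Illustrative slug rule: lowercase, non-alphanumerics become dashes.
      const id = text
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')
        .replace(/^-|-$/g, '');
      return `${indent}* [${text}](#${id})`;
    })
    .join('\n');
}

const tocOutput = buildTocSketch([
  { depth: 1, text: 'File system' },
  { depth: 2, text: 'fs.readFile(path)' },
]);
console.log(tocOutput);
```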
269+
#### `altDocs`
270+
271+
**New Tooling:** All features within this method are available within the new
272+
tooling.
273+
274+
This method generates a version picker for the current page to be shown in older
275+
versions of the API docs.
276+
277+
### `json.mjs`
278+
279+
This file is responsible for generating a JSON object that (supposedly) is used
280+
for IDE-Intellisense or for indexing of all the "methods", "classes", "modules",
281+
"events", "constants" and "globals" available within a certain Markdown file.
282+
283+
It attempts a best effort extraction of the data by using several regular
284+
expression patterns (RegEx).
285+
286+
**Note:** JSON output generation is currently not supported by the new tooling,
287+
but it is in the pipeline for development.
288+
289+
#### `jsonAPI`
290+
291+
This method traverses all the AST Nodes by iterating through each one of them
292+
and infers the kind of information each node contains through RegEx. Then it
293+
mutates the data and appends it to the final JSON object.
294+
295+
For more in-depth information, we recommend referring to the `json.mjs` file, as
296+
it contains a lot of comments.
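The RegEx-based inference described above can be pictured with a small sketch that classifies heading text. The patterns, the function name `classifyHeading` and the resulting object shape are assumptions for illustration and are much simpler than what `json.mjs` does.

```javascript
// Hypothetical sketch: infers the kind of entry from a heading's text.
function classifyHeading(rawText) {
  const text = rawText.replace(/`/g, ''); // drop inline-code backticks

  if (text.startsWith('Class: ')) {
    return { type: 'class', name: text.slice('Class: '.length) };
  }
  const eventMatch = /^Event: '([^']+)'$/.exec(text);
  if (eventMatch) {
    return { type: 'event', name: eventMatch[1] };
  }
  const methodMatch = /^([\w.]+)\((.*)\)$/.exec(text);
  if (methodMatch) {
    return { type: 'method', name: methodMatch[1], signature: methodMatch[2] };
  }
  return { type: 'misc', name: text };
}

const classifiedHeadings = [
  'Class: `Buffer`',
  "Event: 'close'",
  '`fs.readFile(path[, options], callback)`',
  'Usage',
].map(classifyHeading);
console.log(classifiedHeadings);
```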
