Commit 170199e

feat: Implement HTML pre-processing option
1 parent 38fcc4b commit 170199e

12 files changed, +249 -81 lines changed

Makefile

+1
Original file line number | Diff line number | Diff line change
@@ -17,6 +17,7 @@ BASIC_DUTIES = \
1717
docs \
1818
docs-deploy \
1919
format \
20+
manpage \
2021
release
2122

2223
QUALITY_DUTIES = \

README.md

+75-6
@@ -16,13 +16,13 @@ Pandoc must be [installed](https://pandoc.org/installing.html) and available as
1616

1717
With `pip`:
1818
```bash
19-
pip install mkdocs-manpage
19+
pip install mkdocs-manpage[preprocess]
2020
```
2121

2222
With [`pipx`](https://github.com/pipxproject/pipx):
2323
```bash
2424
python3.7 -m pip install --user pipx
25-
pipx install mkdocs-manpage
25+
pipx install mkdocs-manpage[preprocess]
2626
```
2727

2828
## Usage
@@ -31,15 +31,85 @@ pipx install mkdocs-manpage
3131
# mkdocs.yml
3232
plugins:
3333
- manpage:
34-
enabled: !ENV [MANPAGE, false]
3534
pages:
3635
- index.md
3736
- usage.md
3837
- reference/api.md
3938
```
4039
41-
We also recommend disabling some options from other plugins/extensions
42-
to improve the final manual page:
40+
To enable/disable the plugin with an environment variable:
41+
42+
```yaml
43+
# mkdocs.yml
44+
plugins:
45+
- manpage:
46+
enabled: !ENV [MANPAGE, false]
47+
```
48+
49+
Then set the environment variable and run MkDocs:
50+
51+
```bash
52+
MANPAGE=true mkdocs build
53+
```
54+
55+
The manpage will be written into the root of the site directory
56+
and named `manpage.1`.
57+
58+
### Pre-processing HTML
59+
60+
This plugin works by concatenating the HTML from all selected pages
61+
into a single file that is then converted to a manual page using Pandoc.
62+
63+
With a complete conversion, the final manual page will not look so good.
64+
For example, images and SVGs will be rendered as long strings of data and/or URIs.
65+
So this plugin allows users to pre-process the HTML, to remove unwanted
66+
HTML elements before converting the whole thing to a manpage.
67+
68+
First, you must make sure to install the `preprocess` extra:
69+
70+
```bash
71+
pip install mkdocs-manpage[preprocess]
72+
```
73+
74+
To pre-process the HTML, we use [BeautifulSoup](https://pypi.org/project/beautifulsoup4/).
75+
Users have to write their own `preprocess` function in order to modify the soup
76+
returned by BeautifulSoup:
77+
78+
```python title="scripts/preprocess.py"
79+
from bs4 import BeautifulSoup, Tag
80+
81+
82+
def to_remove(tag: Tag) -> bool:
83+
# remove images and SVGs
84+
if tag.name in {"img", "svg"}:
85+
return True
86+
# remove links containing images (only `tag.img` is checked here)
87+
if tag.name == "a" and tag.img and to_remove(tag.img):
88+
return True
89+
# remove permalinks
90+
if tag.name == "a" and "headerlink" in tag.get("class", ()):
91+
return True
92+
return False
93+
94+
95+
def preprocess(soup: BeautifulSoup) -> None:
96+
for element in soup.find_all(to_remove):
97+
element.decompose()
98+
```
99+
100+
Then, instruct the plugin to use this module and its `preprocess` function:
101+
102+
```yaml title="mkdocs.yml"
103+
plugins:
104+
- manpage:
105+
preprocess: scripts/preprocess.py
106+
```
107+
108+
See the documentation of both [`BeautifulSoup`][bs4.BeautifulSoup] and [`Tag`][bs4.Tag]
109+
to know what methods are available to correctly select the elements to remove.
110+
111+
The alternative to HTML processing for improving the final manpage
112+
is disabling some options from other plugins/extensions:
43113

44114
- no source code through `mkdocstrings`:
45115

@@ -67,5 +137,4 @@ export MANPAGE=true
67137
export PERMALINK=false
68138
export SHOW_SOURCE=false
69139
mkdocs build
70-
# manpage is in site dir: ./site/manpage.1
71140
```

docs/insiders/changelog.md

+4
@@ -2,6 +2,10 @@
22

33
## MkDocs Manpage Insiders
44

5+
### 1.1.0 <small>June 07, 2023</small> { id="1.1.0" }
6+
7+
- Implement an HTML pre-processing option, to improve manpage rendering
8+
59
### 1.0.0 <small>June 06, 2023</small> { id="1.0.0" }
610

711
- Release first Insiders version

docs/license.md

+2
@@ -1,3 +1,5 @@
1+
# License
2+
13
```
24
--8<-- "LICENSE"
35
```

duties.py

-4
@@ -309,9 +309,5 @@ def manpage(ctx: Context) -> None:
309309
Parameters:
310310
ctx: The context instance (passed automatically).
311311
"""
312-
os.environ["MANPAGE"] = "true"
313-
os.environ["SHOW_SOURCE"] = "false"
314-
os.environ["PERMALINK"] = "false"
315-
os.environ["DEPLOY"] = "false"
316312
ctx.run(mkdocs.build(), title="Building docs and manpage")
317313
ctx.run("man ./site/manpage.1", capture=False)

mkdocs.yml

+2-1
@@ -108,6 +108,7 @@ plugins:
108108
python:
109109
import:
110110
- https://docs.python.org/3/objects.inv
111+
- https://www.crummy.com/software/BeautifulSoup/bs4/doc/objects.inv
111112
options:
112113
separate_signature: true
113114
merge_init_into_class: true
@@ -118,7 +119,7 @@ plugins:
118119
enabled: !ENV [DEPLOY, false]
119120
repository: pawamoy/mkdocs-manpage
120121
- manpage:
121-
enabled: !ENV [MANPAGE, false]
122+
preprocess: scripts/preprocess.py
122123
pages:
123124
- index.md
124125
- changelog.md

pyproject.toml

+9-1
@@ -27,7 +27,15 @@ classifiers = [
2727
"Topic :: Utilities",
2828
"Typing :: Typed",
2929
]
30-
dependencies = []
30+
dependencies = [
31+
"importlib-metadata>=4; python_version < '3.8'"
32+
]
33+
34+
[project.optional-dependencies]
35+
preprocess = [
36+
"beautifulsoup4>=4.12",
37+
"lxml>=4.9",
38+
]
3139

3240
[project.urls]
3341
Homepage = "https://pawamoy.github.io/mkdocs-manpage"
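
Because `beautifulsoup4` and `lxml` are declared as an optional `preprocess` extra, code that relies on them may want to check that they are installed before using them. The following is a minimal sketch of such a check, not code from this commit; the function names `preprocess_extra_available` and `require_preprocess_extra` are hypothetical.

```python
from importlib.util import find_spec


def preprocess_extra_available() -> bool:
    """Return True when both packages of the `preprocess` extra are importable.

    Hypothetical helper: the import names `bs4` and `lxml` correspond to the
    `beautifulsoup4` and `lxml` distributions listed in the extra.
    """
    return all(find_spec(name) is not None for name in ("bs4", "lxml"))


def require_preprocess_extra() -> None:
    """Raise a helpful error when the optional dependencies are missing."""
    if not preprocess_extra_available():
        raise RuntimeError(
            "HTML pre-processing requires the 'preprocess' extra: "
            "pip install mkdocs-manpage[preprocess]"
        )
```

A caller could invoke `require_preprocess_extra()` once, before importing `bs4`, so that users get an actionable message instead of a bare `ImportError`.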

scripts/gen_credits.py

+2
@@ -89,6 +89,8 @@ def _render_credits() -> str:
8989
}
9090
template_text = dedent(
9191
"""
92+
# Credits
93+
9294
These projects were used to build *{{ project_name }}*. **Thank you!**
9395
9496
[`python`](https://www.python.org/) |

scripts/preprocess.py

+37
@@ -0,0 +1,37 @@
1+
"""HTML pre-processing module."""
2+
3+
from __future__ import annotations
4+
5+
from typing import TYPE_CHECKING
6+
7+
if TYPE_CHECKING:
8+
from bs4 import BeautifulSoup as Soup
9+
from bs4 import Tag
10+
11+
12+
def to_remove(tag: Tag) -> bool:
13+
"""Tell whether a tag should be removed from the soup.
14+
15+
Parameters:
16+
tag: The tag to check.
17+
18+
Returns:
19+
True or false.
20+
"""
21+
# Remove images and SVGs.
22+
if tag.name in {"img", "svg"}:
23+
return True
24+
# Remove permalinks and links wrapping images.
25+
if tag.name == "a" and ("headerlink" in tag.get("class", "") or tag.img and to_remove(tag.img)):
26+
return True
27+
return False
28+
29+
30+
def preprocess(soup: Soup) -> None:
31+
"""Pre-process the soup by removing elements.
32+
33+
Parameters:
34+
soup: The soup to modify.
35+
"""
36+
for element in soup.find_all(to_remove):
37+
element.decompose()

src/mkdocs_manpage/config.py

+1-57
@@ -2,69 +2,13 @@
22

33
from __future__ import annotations
44

5-
import contextlib
6-
from typing import Any, Dict, Generic, TypeVar
7-
85
from mkdocs.config import config_options as mkconf
9-
from mkdocs.config.base import Config
106
from mkdocs.config.base import Config as BaseConfig
11-
from mkdocs.config.config_options import BaseConfigOption, LegacyConfig, ValidationError
12-
13-
T = TypeVar("T")
14-
15-
16-
# TODO: remove once https://github.com/mkdocs/mkdocs/pull/3242 is merged and released
17-
class DictOfItems(Generic[T], BaseConfigOption[Dict[str, T]]):
18-
"""Validates a dict of items. Keys are always strings.
19-
20-
E.g. for `config_options.DictOfItems(config_options.Type(int))` a valid item is `{"a": 1, "b": 2}`.
21-
"""
22-
23-
required: bool | None = None # Only for subclasses to set.
24-
25-
def __init__(self, option_type: BaseConfigOption[T], default: Any = None) -> None: # noqa: D107
26-
super().__init__()
27-
self.default = default
28-
self.option_type = option_type
29-
self.option_type.warnings = self.warnings
30-
31-
def __repr__(self) -> str:
32-
return f"{type(self).__name__}: {self.option_type}"
33-
34-
def pre_validation(self, config: Config, key_name: str) -> None: # noqa: D102
35-
self._config = config
36-
self._key_name = key_name
37-
38-
def run_validation(self, value: object) -> dict[str, T]: # noqa: D102
39-
if value is None:
40-
if self.required or self.default is None:
41-
raise ValidationError("Required configuration not provided.")
42-
value = self.default
43-
if not isinstance(value, dict):
44-
raise ValidationError(f"Expected a dict of items, but a {type(value)} was given.")
45-
if not value: # Optimization for empty list
46-
return value
47-
48-
fake_config = LegacyConfig(())
49-
with contextlib.suppress(AttributeError):
50-
fake_config.config_file_path = self._config.config_file_path
51-
52-
# Emulate a config-like environment for pre_validation and post_validation.
53-
fake_config.data = value
54-
55-
for key_name in fake_config:
56-
self.option_type.pre_validation(fake_config, key_name)
57-
for key_name in fake_config:
58-
# Specifically not running `validate` to avoid the OptionallyRequired effect.
59-
fake_config[key_name] = self.option_type.run_validation(fake_config[key_name])
60-
for key_name in fake_config:
61-
self.option_type.post_validation(fake_config, key_name)
62-
63-
return value
647

658

669
class PluginConfig(BaseConfig):
6710
"""Configuration options for the plugin."""
6811

6912
enabled = mkconf.Type(bool, default=True)
7013
pages = mkconf.ListOfItems(mkconf.Type(str))
14+
preprocess = mkconf.File(exists=True)
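
The `preprocess` option only validates that the given file exists; the part of the plugin that loads the file is not shown in this excerpt. One plausible way to load a module from a file path and call its `preprocess` function is sketched below with the standard library. The helper `load_module_from_path` and the module name `user_preprocess` are assumptions for illustration, not the plugin's actual code, and a plain list stands in for the BeautifulSoup object.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(path: str, name: str = "user_preprocess") -> ModuleType:
    """Import a Python file by path (hypothetical helper, not the plugin's API)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Demonstration with a throwaway script; a list stands in for the soup.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "preprocess.py"
    script.write_text("def preprocess(soup):\n    soup.append('processed')\n")
    module = load_module_from_path(str(script))
    fake_soup = []
    module.preprocess(fake_soup)  # the user function mutates its argument in place
```

Like the `preprocess` function in `scripts/preprocess.py`, the loaded function here returns nothing and modifies the object it receives.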
