-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC/TST: Validate documentation pages #24216
Conversation
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Hello @FHaase! Thanks for updating the PR.
Comment last updated on December 10, 2018 at 23:59 Hours UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking care of this @FHaase
Added some comments. If possible, would be nice to add it to ci/code_checks.py
in this PR, even if it's for a single document.
Btw, I think you're based in Germany. Not sure if you're subscribed to the pandas-dev list, but in case you're not and you may be interested: https://mail.python.org/pipermail/pandas-dev/2018-December/000862.html
doc/make.py
Outdated
@@ -21,6 +21,14 @@ | |||
import webbrowser | |||
import jinja2 | |||
|
|||
sys.path.insert(0, os.path.abspath('../scripts')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 on importing anything in scripts/
from other places. When we make changes to the scripts, we don't need to worry about breaking other parts of the code at the moment. I like to keep it this way.
I think this part is not essential, is it?
@@ -0,0 +1,227 @@ | |||
import argparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a comment at the beginning of the script explaining what it does and how to use it.
@@ -0,0 +1,227 @@ | |||
import argparse | |||
from fnmatch import fnmatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't used fnmatch
, but I prefer to import modules with submodules having the same name (like datetime
) as import fnmatch
. Then, in the code, when you see fnmatch.fnmatch
there is no ambiguity whether it's the module or the submodule.
scripts/validate_documentation.py
Outdated
import re | ||
import sys | ||
|
||
import docutils.nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docutils is part of the standard library, right? why the blank line if that's the case?
scripts/validate_documentation.py
Outdated
|
||
DOCUMENTATION_SOURCE = os.path.join(os.curdir, '../doc/source') | ||
DOCTREE_PATH = '../doc/build/doctrees/{}.doctree' | ||
RST_PATH = '../doc/source/{}.rst' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are assuming here that the current directory (pwd
) is scripts/
, but the script can also be called as ./scripts/validate_documentation.py
, and I think those paths will be wrong. It's usually a good practice to have a BASE_PATH
first that is the root of the project, and use it as the base for all those. I'd say you should find the code in other scripts.
Also, better use os.path.join
instead of the hardcoded /
, so this can also run in windows.
|
||
|
||
class DocumentChecker(object): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be a short docstring for the class? If you explain what the script does at the beginning of the file there is not much to say here, but a comment on what does this class do, or how is it expected to be used, can be useful.
issue.append((self.find_line(match), kwargs)) | ||
|
||
def find_line(self, match): | ||
if not match: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add docstrings to those too? It's not obvious what they do
scripts/validate_documentation.py
Outdated
|
||
result = {} | ||
for root, dirs, files in os.walk(DOCUMENTATION_SOURCE): | ||
_, base_dir = root.split('../doc/source') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use os.path.join
scripts/validate_documentation.py
Outdated
return self.issues | ||
|
||
|
||
def report(reports, output_format='default', errors=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find report
descriptive enough
scripts/validate_documentation.py
Outdated
'a single document)') | ||
add('--exclude', default=None, | ||
help='comma separated ' | ||
'patterns of pages to exclude. By default it ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the help message I don't understand how I should use this. --exclude=io.rst,api.rst
?
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Codecov Report
@@ Coverage Diff @@
## master #24216 +/- ##
=======================================
Coverage 92.21% 92.21%
=======================================
Files 162 162
Lines 51763 51763
=======================================
Hits 47733 47733
Misses 4030 4030
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #24216 +/- ##
==========================================
+ Coverage 92.21% 92.21% +<.01%
==========================================
Files 162 162
Lines 51763 51763
==========================================
+ Hits 47733 47734 +1
+ Misses 4030 4029 -1
Continue to review full report at Codecov.
|
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
I've added the checks to |
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Signed-off-by: Fabian Haase <haase.fabian@gmail.com>
Signed-off-by: pandas-docs-bot <pandas-docs-bot@localhost.foo>
@datapythonista The ci integration is in |
Nice, lots of stuff to fix :) I think you mention it somewhere, but I can't find it now. What's the reason to run in And why adding it to |
I opened #24281 to fix the tailing whitespace problems. We can create a separate issue to fix the lists, but I guess that will be harder to automate and we may need to do it manually. Btw, you can see how lists are rendered with the new theme here: https://pandas-dev.github.io/pandas-sphinx-theme/pr-datapythonista_base/generated/pandas.DataFrame.head.html#pandas.DataFrame.head I guess this won't fix the problem with the |
Adding it to The last thing is also the reason why it could not be implemented in
I don't really check the number of spaces in the file, but literally whether there is a |
BTW, as after the documentation build there are also all the docstrings generated for the classes functions, A possible solution for #22900 would be to run flake8-rst on the |
I see. My concern is in not making things too complex. There are lots of scripts in pandas, a lot of complexity in building the documentation... Linting the docs calling a script from Having something similar, but importing If it needs to run after (or during) the docs build. Would make sense to have this implemented as a sphinx extension? And I think it's generic enough to be independent of pandas and be used by other projects too. What do you think? Does it make sense? |
I added validation for tailing whitespaces in all files (not only documentation) in #24286. I think that may simplify the work here, and it was trivial to validate with a grep there. |
I don't think we'll merge this like it is. Not sure if that's too much complexity for the value it adds. But in any case, if we follow this approach, we should implement this as a sphinx extension, probably in a different package and install it via conda. |
git diff upstream/master -u -- "*.py" | flake8 --diff