Message-ID: <87v7sy29rh.fsf@trenco.lwn.net>
Date: Mon, 24 Feb 2025 16:38:58 -0700
From: Jonathan Corbet <corbet@....net>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>, Linux Doc Mailing
List <linux-doc@...r.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>, "Gustavo A. R. Silva"
<mchehab+huawei@...nel.org>, Mauro Carvalho Chehab
<mchehab+huawei@...nel.org>, Kees Cook <mchehab+huawei@...nel.org>,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 09/39] scripts/kernel-doc.py: add a Python parser
Mauro Carvalho Chehab <mchehab+huawei@...nel.org> writes:
> Maintaining kernel-doc has been a challenge, as there aren't many
> perl developers among maintainers. Also, the logic there is too
> complex. Having lots of global variables and using pure functions
> doesn't help.
>
> Rewrite the script in Python, placing most global variables
> inside classes. This should help maintaining the script in long
> term.
[...]
> diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
> new file mode 100755
> index 000000000000..5cf5ed63f215
> --- /dev/null
> +++ b/scripts/kernel-doc.py
> @@ -0,0 +1,2757 @@
> +#!/usr/bin/env python3
> +# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
> +# pylint: disable=C0302,C0103,C0301
> +# pylint: disable=C0116,C0115,W0511,W0613
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@...nel.org>.
> +# SPDX-License-Identifier: GPL-2.0
The SPDX tag is supposed to be up top, right under the shebang.
I also think you should give consideration to preserving the other
copyright notices in the Perl version. A language translation doesn't
remove existing copyrights...who knows how much creativity went into
some of those regexes?
> +# TODO: implement warning filtering
> +
> +"""
> +kernel_doc
> +==========
> +
> +Print formatted kernel documentation to stdout
> +
> +Read C language source or header FILEs, extract embedded
> +documentation comments, and print formatted documentation
> +to standard output.
> +
> +The documentation comments are identified by the "/**"
> +opening comment mark.
> +
> +See Documentation/doc-guide/kernel-doc.rst for the
> +documentation comment syntax.
> +"""
> +
> +import argparse
> +import logging
> +import os
> +import re
> +import sys
> +
> +from datetime import datetime
> +from pprint import pformat
> +
> +from dateutil import tz
> +
> +# Local cache for regular expressions
> +re_cache = {}
> +
> +
> +class Re:
So I have to say this bugs me a bit ... the class is fine, but the
one-letter case-only difference from the standard "re" module is just
going to make the code harder for others to approach. "kern_re" or
something like that? Or even "kre" if you really want it to be as short
as possible.
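For illustration only, a minimal sketch of what a renamed wrapper might look like. The name KernRe, the cache variable, and the method set are assumptions made for this example; the actual class body is elided from the quoted patch.

```python
import re

# Hypothetical module-level cache, keyed on pattern and flags.
_kern_re_cache = {}


class KernRe:
    """Thin wrapper around a compiled pattern, with optional caching."""

    def __init__(self, pattern, cache=True, flags=0):
        key = (pattern, flags)
        if cache and key in _kern_re_cache:
            # Reuse the already-compiled pattern.
            self.regex = _kern_re_cache[key]
        else:
            self.regex = re.compile(pattern, flags)
            if cache:
                _kern_re_cache[key] = self.regex

    def search(self, string):
        """Return the first match in string, or None."""
        return self.regex.search(string)

    def sub(self, repl, string, count=0):
        """Replace matches in string with repl."""
        return self.regex.sub(repl, string, count=count)
```

With a distinct name, a reader can tell at a glance whether a call goes to the wrapper or to the standard library.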
> + """
> + Helper class to simplify regex declaration and usage,
> +
> + It calls re.compile for a given pattern. It also allows adding
> + regular expressions and define sub at class init time.
> +
> + Regular expressions can be cached via an argument, helping to speedup
> + searches.
> + """
[...]
> +
> +class KernelDoc:
> + # Parser states
> + STATE_NORMAL = 0 # normal code
> + STATE_NAME = 1 # looking for function name
> + STATE_BODY_MAYBE = 2 # body - or maybe more description
> + STATE_BODY = 3 # the body of the comment
> + STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
> + STATE_PROTO = 5 # scanning prototype
> + STATE_DOCBLOCK = 6 # documentation block
> + STATE_INLINE = 7 # gathering doc outside main block
> +
> + st_name = [
> + "NORMAL",
> + "NAME",
> + "BODY_MAYBE",
> + "BODY",
> + "BODY_WITH_BLANK_LINE",
> + "PROTO",
> + "DOCBLOCK",
> + "INLINE",
> + ]
So these ... kind of look like enums?
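As a rough sketch of that idea, assuming the states only need a stable integer value and a printable name, the standard enum module could cover both the constants and the st_name list. The class name State is made up for this example.

```python
from enum import IntEnum


class State(IntEnum):
    """Parser states, with the same values as the STATE_* constants."""
    NORMAL = 0                # normal code
    NAME = 1                  # looking for function name
    BODY_MAYBE = 2            # body - or maybe more description
    BODY = 3                  # the body of the comment
    BODY_WITH_BLANK_LINE = 4  # the body which has a blank line
    PROTO = 5                 # scanning prototype
    DOCBLOCK = 6              # documentation block
    INLINE = 7                # gathering doc outside main block


state = State.BODY_MAYBE
# The member name replaces a lookup in a parallel list of strings.
print(state.name)
```

The .name attribute would then stand in for indexing st_name, so the two lists cannot drift apart.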
That's kind of it for nits ... I do have one wish that will be kind of hard
to grant overall ... for the long-term maintenance of this code, it
would be really nice if every non-trivial regex were described by a
comment explaining what it is trying to do. It's not reasonable to
expect that as a condition for accepting this rewrite, but it sure would
be a nice goal to be working toward.
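One possible way to get there, sketched here with a made-up pattern rather than one from the patch, is re.VERBOSE, which lets the explanation live inside the regex itself. The name doc_start and the pattern are assumptions for illustration.

```python
import re

# Hypothetical example: match a line holding only the "/**" mark
# that opens a kernel-doc comment.
doc_start = re.compile(r"""
    ^\s*        # optional leading whitespace
    /\*\*       # the literal "/**" opening mark
    \s*$        # nothing but whitespace after it
""", re.VERBOSE)

for line in ["/**", "  /**  ", "/* plain comment */", "/** text"]:
    print(repr(line), bool(doc_start.match(line)))
```

A plain comment line above each regex would serve the same purpose where the verbose form gets in the way.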
Thanks,
jon