linux-kernel - Re: [PATCH] docs: Make automarkup ready for Sphinx 3.1+

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20201008072757.0ca42539@coco.lan>
Date:   Thu, 8 Oct 2020 07:27:57 +0200
From:   Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To:     "Nícolas F. R. A. Prado" <nfraprado@...tonmail.com>,
        linux-doc@...r.kernel.org
Cc:     Jonathan Corbet <corbet@....net>, linux-kernel@...r.kernel.org,
        lkcamp@...ts.libreplanetbr.org, andrealmeid@...labora.com
Subject: Re: [PATCH] docs: Make automarkup ready for Sphinx 3.1+

Hi Nícolas,

Em Wed, 07 Oct 2020 23:12:25 +0000
Nícolas F. R. A. Prado <nfraprado@...tonmail.com> escreveu:

> While Sphinx 2 used a single c:type role for struct, union, enum and
> typedef, Sphinx 3 uses a specific role for each one.
> To keep backward compatibility, detect the Sphinx version and use the
> correct roles for that version.
> 
> Also, Sphinx 3 is more strict with its C domain and generated warnings,
> exposing issues in the parsing.
> To fix the warnings, make the C regexes use ASCII, ensure the
> expressions only match the beginning of words and skip trying to
> cross-reference C reserved words.
> 
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@...tonmail.com>
> ---
> 
> Hi,
> 
> after Mauro's series making everything ready for Sphinx 3.1, only the automarkup
> was left to be ported.
> This patch makes the needed changes to automarkup so that we can soon flip the
> switch to Sphinx 3.1.
> 
> This change was tested both with Sphinx 2.4.4 and Sphinx 3.1.
> 
> This change doesn't add any warnings to the Documentation build.
> I tested it with Mauro's series but later rebased it to docs-next, and it can be
> accepted independently of that series.
> 
> I ended up doing more than one thing in this single patch, but since it was all
> changing the same lines and for the same purpose, I felt it would be better to
> keep it as a single commit.
> 

Thanks for doing this! That was the last missing part on fully supporting
Sphinx 3.1+.

> Mauro,
> if this patch is ok, the 3rd patch in your series, which disables automarkup for
> sphinx 3, should be dropped.

Yeah, sure.

> Although I'm not sure what the implications of your patches adding namespaces
> and using the c:macro for functions are.

With regards to namespaces:

Currently, only the media docs use namespaces, and it declares it at the
beginning of each file that needs it, without overriding it later[1].

[1] btw, the cdomain.py backward compat code doesn't support namespace
    changes - as it parses namespaces before handling the C domain tags.
    I doubt that we'll need to have a single .rst file using more than
    one namespace anyway.

The main usage is to avoid conflicts for uAPI documentation for
syscalls - actually for libc userspace wrappers to syscalls. It documents
things like: open, close, read, write, ioctl, poll, select.

I'm not sure if the automarkup should be aware of it, or if the c.py code
at Sphinx 3.x will add the namespace automatically, but I suspect that
automarkup will need to handle it as well.

One file you could use for checking it is this one:

	Documentation/userspace-api/media/v4l/hist-v4l2.rst

It contains a namespace directive and documents what changed without
using any explicit reference (after my patch series + linux-next).

With regards to c:macro vs c:function:

I suspect that automarkup should test both when trying to do 
cross-references for function-like calls. E. g. test first if
there is a :c:function, falling back to check for :c:macro.

I would add a "sphinx3_c_func_ref" function that would handle
such special case, e. g. something like:

    markup_func_sphinx3 = {RE_doc: markup_doc_ref,
                           RE_function: sphinx3_c_func_ref,
                           RE_struct: markup_c_ref,
                           RE_union: markup_c_ref,
                           RE_enum: markup_c_ref,
                           RE_typedef: markup_c_ref}

> All I did here was use the specific roles for sphinx 3 and fix the warnings, but
> that was enough to get correct cross-references even after your series.
> 
> Thanks,
> Nícolas

> 
> 
>  Documentation/sphinx/automarkup.py | 69 ++++++++++++++++++++++++++----
>  1 file changed, 60 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/sphinx/automarkup.py b/Documentation/sphinx/automarkup.py
> index a1b0f554cd82..fd1e927408ad 100644
> --- a/Documentation/sphinx/automarkup.py
> +++ b/Documentation/sphinx/automarkup.py
> @@ -22,13 +22,34 @@ from itertools import chain
>  # :c:func: block (i.e. ":c:func:`mmap()`s" flakes out), so the last
>  # bit tries to restrict matches to things that won't create trouble.
>  #
> -RE_function = re.compile(r'(([\w_][\w\d_]+)\(\))')
> -RE_type = re.compile(r'(struct|union|enum|typedef)\s+([\w_][\w\d_]+)')
> +RE_function = re.compile(r'\b(([a-zA-Z_]\w+)\(\))', flags=re.ASCII)
> +
> +#
> +# Sphinx 2 uses the same :c:type role for struct, union, enum and typedef
> +#
> +RE_generic_type = re.compile(r'\b(struct|union|enum|typedef)\s+([a-zA-Z_]\w+)',
> +                             flags=re.ASCII)
> +
> +#
> +# Sphinx 3 uses a different C role for each one of struct, union, enum and
> +# typedef
> +#
> +RE_struct = re.compile(r'\b(struct)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> +RE_union = re.compile(r'\b(union)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> +RE_enum = re.compile(r'\b(enum)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> +RE_typedef = re.compile(r'\b(typedef)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> +
>  #
>  # Detects a reference to a documentation page of the form Documentation/... with
>  # an optional extension
>  #
> -RE_doc = re.compile(r'Documentation(/[\w\-_/]+)(\.\w+)*')
> +RE_doc = re.compile(r'\bDocumentation(/[\w\-_/]+)(\.\w+)*')
> +
> +#
> +# Reserved C words that we should skip when cross-referencing
> +#
> +Skipnames = [ 'for', 'if', 'register', 'sizeof', 'struct', 'unsigned' ]
> +
>  
>  #
>  # Many places in the docs refer to common system calls.  It is
> @@ -48,9 +69,22 @@ def markup_refs(docname, app, node):
>      #
>      # Associate each regex with the function that will markup its matches
>      #
> -    markup_func = {RE_type: markup_c_ref,
> -                   RE_function: markup_c_ref,
> -                   RE_doc: markup_doc_ref}
> +    markup_func_sphinx2 = {RE_doc: markup_doc_ref,
> +                           RE_function: markup_c_ref,
> +                           RE_generic_type: markup_c_ref}
> +
> +    markup_func_sphinx3 = {RE_doc: markup_doc_ref,
> +                           RE_function: markup_c_ref,
> +                           RE_struct: markup_c_ref,
> +                           RE_union: markup_c_ref,
> +                           RE_enum: markup_c_ref,
> +                           RE_typedef: markup_c_ref}
> +
> +    if sphinx.__version__[0] == '3':
> +        markup_func = markup_func_sphinx3
> +    else:
> +        markup_func = markup_func_sphinx2
> +
>      match_iterators = [regex.finditer(t) for regex in markup_func]
>      #
>      # Sort all references by the starting position in text
> @@ -79,8 +113,24 @@ def markup_refs(docname, app, node):
>  # type_name) with an appropriate cross reference.
>  #
>  def markup_c_ref(docname, app, match):
> -    class_str = {RE_function: 'c-func', RE_type: 'c-type'}
> -    reftype_str = {RE_function: 'function', RE_type: 'type'}
> +    class_str = {RE_function: 'c-func',
> +                 # Sphinx 2 only
> +                 RE_generic_type: 'c-type',
> +                 # Sphinx 3+ only
> +                 RE_struct: 'c-struct',
> +                 RE_union: 'c-union',
> +                 RE_enum: 'c-enum',
> +                 RE_typedef: 'c-type',
> +                 }
> +    reftype_str = {RE_function: 'function',
> +                   # Sphinx 2 only
> +                   RE_generic_type: 'type',
> +                   # Sphinx 3+ only
> +                   RE_struct: 'struct',
> +                   RE_union: 'union',
> +                   RE_enum: 'enum',
> +                   RE_typedef: 'type',
> +                   }
>  
>      cdom = app.env.domains['c']
>      #
> @@ -89,7 +139,8 @@ def markup_c_ref(docname, app, match):
>      target = match.group(2)
>      target_text = nodes.Text(match.group(0))
>      xref = None
> -    if not (match.re == RE_function and target in Skipfuncs):
> +    if not ((match.re == RE_function and target in Skipfuncs)
> +            or (target in Skipnames)):
>          lit_text = nodes.literal(classes=['xref', 'c', class_str[match.re]])
>          lit_text += target_text
>          pxref = addnodes.pending_xref('', refdomain = 'c',



Thanks,
Mauro