linux-kernel - Re: [PATCH 2/8] Documentation/sphinx: fix Python string escapes

Message-ID: <ab85e604-b7ba-dbbe-53c2-2454e145d829@linux.ibm.com>
Date:   Tue, 15 Aug 2023 09:26:10 +1000
From:   Benjamin Gray <bgray@...ux.ibm.com>
To:     Jonathan Corbet <corbet@....net>, linux-kernel@...r.kernel.org,
        linux-ia64@...r.kernel.org, linux-doc@...r.kernel.org,
        bpf@...r.kernel.org, linux-pm@...r.kernel.org
Cc:     abbotti@....co.uk, hsweeten@...ionengravers.com,
        jan.kiszka@...mens.com, kbingham@...nel.org, mykolal@...com
Subject: Re: [PATCH 2/8] Documentation/sphinx: fix Python string escapes

On 14/8/23 11:35 pm, Jonathan Corbet wrote:
> Benjamin Gray <bgray@...ux.ibm.com> writes:
> 
>> Python 3.6 introduced a DeprecationWarning for invalid escape sequences.
>> This is upgraded to a SyntaxWarning in Python 3.12, and will eventually
>> be a syntax error.
>>
>> Fix these now to get ahead of it before it's an error.
>>
>> Signed-off-by: Benjamin Gray <bgray@...ux.ibm.com>
>> ---
>>   Documentation/sphinx/cdomain.py             | 2 +-
>>   Documentation/sphinx/kernel_abi.py          | 2 +-
>>   Documentation/sphinx/kernel_feat.py         | 2 +-
>>   Documentation/sphinx/kerneldoc.py           | 2 +-
>>   Documentation/sphinx/maintainers_include.py | 8 ++++----
>>   5 files changed, 8 insertions(+), 8 deletions(-)
> 
> So I am the maintainer for this stuff...is there a reason you didn't
> copy me on this work?

Sorry, I thought the list linux-doc@...r.kernel.org itself was enough. I 
haven't done a cross-tree series before, and I was a bit averse to CC'ing 
everyone who appears as a maintainer for every patch.

> 
>> diff --git a/Documentation/sphinx/cdomain.py b/Documentation/sphinx/cdomain.py
>> index ca8ac9e59ded..dbdc74bd0772 100644
>> --- a/Documentation/sphinx/cdomain.py
>> +++ b/Documentation/sphinx/cdomain.py
>> @@ -93,7 +93,7 @@ def markup_ctype_refs(match):
>>   #
>>   RE_expr = re.compile(r':c:(expr|texpr):`([^\`]+)`')
>>   def markup_c_expr(match):
>> -    return '\ ``' + match.group(2) + '``\ '
>> +    return '\\ ``' + match.group(2) + '``\\ '
> 
> I have to wonder about this one; I doubt the intent was to insert a
> literal backslash.  I have to fire up my ancient build environment to
> even try this, but even if it's right...

Yeah, there is even a file that just has a syntax error. I don't have a 
way to verify the original script was correct, but I have verified this 
series doesn't change the parsed AST.
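
As a rough illustration of that kind of check (a minimal sketch, not the
script used for this series; `same_ast` is a hypothetical helper name), the
old and new source text can be parsed and their AST dumps compared. It
assumes a Python version where an invalid escape is still only a warning.

```python
import ast
import warnings

def same_ast(old_src, new_src):
    """Return True if both source strings parse to identical ASTs."""
    with warnings.catch_warnings():
        # Silence the invalid-escape warning that the old source triggers.
        warnings.simplefilter("ignore")
        old_tree = ast.parse(old_src)
        new_tree = ast.parse(new_src)
    # ast.dump() leaves out line/column attributes by default, so only the
    # structure and the decoded constant values are compared.
    return ast.dump(old_tree) == ast.dump(new_tree)

# Source text before the patch, after it, and as a raw-string variant.
old = r"s = '\ ``' + x + '``\ '"
new = r"s = '\\ ``' + x + '``\\ '"
raw = r"s = r'\ ``' + x + r'``\ '"
print(same_ast(old, new), same_ast(old, raw))
```

All three spellings decode to the same string constant, so the dumps match
whichever escaping style is picked.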

In this case though, it's generating reST, where a backslash-escaped space 
separates inline markup from adjacent text, so it might just be 
conservatively guarding against generating bad markup[1].

[1]: 
https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#inline-markup 
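
To make the effect concrete, here is a small sketch that runs the patched
`markup_c_expr` from the hunk above on a made-up input line (the sample text
is an assumption, not taken from the kernel docs). The output keeps a literal
backslash followed by a space on each side of the ``literal``.

```python
import re

# Pattern and function as they appear in the quoted cdomain.py hunk.
RE_expr = re.compile(r':c:(expr|texpr):`([^\`]+)`')

def markup_c_expr(match):
    return '\\ ``' + match.group(2) + '``\\ '

# Hypothetical input where the role is glued to the following text.
line = "see :c:expr:`foo(x)`s"
out = RE_expr.sub(markup_c_expr, line)
print(out)

# '\\ ' and r'\ ' are the same two characters: a backslash and a space.
assert '\\ ' == r'\ ' and len('\\ ') == 2
```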


>>   #
>>   # Parse Sphinx 3.x C markups, replacing them by backward-compatible ones
>> diff --git a/Documentation/sphinx/kernel_abi.py b/Documentation/sphinx/kernel_abi.py
>> index b5feb5b1d905..b9f026f016fd 100644
>> --- a/Documentation/sphinx/kernel_abi.py
>> +++ b/Documentation/sphinx/kernel_abi.py
>> @@ -138,7 +138,7 @@ class KernelCmd(Directive):
>>                   code_block += "\n    " + l
>>               lines = code_block + "\n\n"
>>   
>> -        line_regex = re.compile("^\.\. LINENO (\S+)\#([0-9]+)$")
>> +        line_regex = re.compile("^\\.\\. LINENO (\\S+)\\#([0-9]+)$")
> 
> All of these really just want to be raw strings - a much more minimal
> fix that makes the result quite a bit more readable:
> 
>       line_regex = re.compile(r"^\.\. LINENO (\S+)\#([0-9]+)$")
>                               ^
>                               |
>    ---------------------------+
> 
> That, I think, is how these should be fixed.

Yup, I mentioned that at the end of the cover letter. I can automate and 
verify the conversion, but automatically deciding which strings _should_ 
be treated as 'regex' strings is fuzzier. Checking if there's a `re.*(` 
prefix on the string should work for most though. I'll give it a shot.
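
A possible shape for that heuristic, as a sketch only (`rawify_re_strings`
is a hypothetical name, and this is not the tooling used for the series): it
looks for an unprefixed string literal directly after `re.<name>(` and adds
an `r` prefix only when the decoded value stays the same. It assumes a
Python version where invalid escapes still only warn.

```python
import ast
import io
import tokenize
import warnings

def rawify_re_strings(source):
    """Prefix 'r' to string literals that directly follow `re.<name>(`."""
    lines = source.splitlines(keepends=True)
    inserts = []
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # old literals have invalid escapes
        toks = list(tokenize.generate_tokens(io.StringIO(source).readline))
        for i in range(4, len(toks)):
            tok = toks[i]
            is_re_call = (toks[i - 4].string == "re"
                          and toks[i - 3].string == "."
                          and toks[i - 2].type == tokenize.NAME
                          and toks[i - 1].string == "(")
            if not (is_re_call and tok.type == tokenize.STRING):
                continue
            if tok.string[0] not in "'\"":
                continue  # already carries a prefix such as r, b or f
            # Only convert when the raw spelling decodes to the same value.
            if ast.literal_eval(tok.string) == ast.literal_eval("r" + tok.string):
                inserts.append(tok.start)
    # Apply right-to-left so earlier column offsets stay valid.
    for row, col in sorted(inserts, reverse=True):
        line = lines[row - 1]
        lines[row - 1] = line[:col] + "r" + line[col:]
    return "".join(lines)

src = r'line_regex = re.compile("^\.\. LINENO (\S+)\#([0-9]+)$")' + "\n"
print(rawify_re_strings(src), end="")
```

Strings with valid escapes such as "\n" are left alone here, since their raw
form decodes differently; those would still need a manual look.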

> Thanks,
> 
> jon