linux-kernel - Re: [RFC] scripts: kernel-doc: fix typedef support for struct parsing

Message-ID: <875z2jlr2j.fsf@meer.lwn.net>
Date:   Mon, 22 Feb 2021 14:40:04 -0700
From:   Jonathan Corbet <corbet@....net>
To:     Aditya Srivastava <yashsri421@...il.com>
Cc:     yashsri421@...il.com, lukas.bulwahn@...il.com,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-kernel-mentees@...ts.linuxfoundation.org
Subject: Re: [RFC] scripts: kernel-doc: fix typedef support for struct parsing

Aditya Srivastava <yashsri421@...il.com> writes:

> There are files in kernel, which use 'typedef struct' syntax for defining
> struct. For eg, include/linux/zstd.h, drivers/scsi/megaraid/mega_common.h,
> etc.
> However, kernel-doc still does not support it, causing a parsing error.
>
> For eg, running scripts/kernel-doc -none on include/linux/zstd.h emits:
> "error: Cannot parse struct or union!"
>
> Add support for parsing it.
>
> Signed-off-by: Aditya Srivastava <yashsri421@...il.com>
> ---
>  scripts/kernel-doc | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 8b5bc7bf4bb8..46e904dc3f87 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -1201,12 +1201,20 @@ sub dump_union($$) {
>  sub dump_struct($$) {
>      my $x = shift;
>      my $file = shift;
> +    my $decl_type;
> +    my $members;
>  
>      if ($x =~ /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/) {
> -	my $decl_type = $1;
> +	$decl_type = $1;
>  	$declaration_name = $2;
> -	my $members = $3;
> +	$members = $3;
> +    } elsif ($x =~ /typedef\s+(struct|union)\s*\{(.*)\}(?:\s*(?:__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*\s*(\w*)\s*;/) {

So this isn't your fault, but these regexes are really getting out of
hand.  I would *really* like to see some effort put into making this
code more understandable / maintainable as we tweak this stuff.  So:

 - Splitting out the common part, as suggested by Lukas, would be really
   useful.  That would also avoid the problem of only one occurrence being
   edited the next time we add a new qualifier.

 - Splitting out other subsections of the regex and giving them symbolic
   names would also help.

 - We really could use some comments before these branches saying what
   they are doing; it is *not* obvious from the code.
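
As a rough illustration of the three points above, here is a minimal
standalone sketch, not the actual kernel-doc change. The names
$attribute, $qualifier, $qualifiers, $type_kw and parse_decl() are
hypothetical and appear nowhere in the patch; the qualifier list is
copied from the regexes quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical names throughout; this is an illustration only.

# A single __attribute__((...)) annotation.
my $attribute = qr{__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)};

# Any one qualifier that may follow the closing brace.  A new qualifier
# only needs to be added here, once.
my $qualifier = qr{__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|$attribute};

# Zero or more qualifiers, each optionally preceded by whitespace.
my $qualifiers = qr{(?:\s*(?:$qualifier))*};

# The keyword that introduces the declaration.
my $type_kw = qr{struct|union};

# Returns (decl_type, name, members), or an empty list if nothing matches.
sub parse_decl {
    my ($x) = @_;

    # Plain form: "struct name { members } qualifiers"
    if ($x =~ /($type_kw)\s+(\w+)\s*\{(.*)\}$qualifiers/) {
        return ($1, $2, $3);
    }

    # Typedef form: "typedef struct { members } qualifiers name;"
    if ($x =~ /typedef\s+($type_kw)\s*\{(.*)\}$qualifiers\s*(\w*)\s*;/) {
        return ($1, $3, $2);
    }

    return;
}

my ($type, $name) = parse_decl("typedef struct { int a; } __packed foo_t;");
print "$type $name\n";
```

Both branches share $qualifiers, so the two patterns cannot drift apart,
and each branch carries a one-line comment naming the syntax it accepts.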

See what I'm getting at here?

Thanks,

jon