lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201026080322.4d0b26f5@coco.lan>
Date:   Mon, 26 Oct 2020 08:03:22 +0100
From:   Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To:     Joe Perches <joe@...ches.com>
Cc:     Jonathan Corbet <corbet@....net>,
        Linux Doc Mailing List <linux-doc@...r.kernel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 01/56] scripts: kernel-doc: fix typedef parsing

Em Fri, 23 Oct 2020 11:01:35 -0700
Joe Perches <joe@...ches.com> escreveu:

> On Fri, 2020-10-23 at 11:22 -0600, Jonathan Corbet wrote:
> > On Fri, 23 Oct 2020 18:32:48 +0200
> > Mauro Carvalho Chehab <mchehab+huawei@...nel.org> wrote:
> >   
> > > The include/linux/genalloc.h file defined this typedef:
> > > 
> > > 	typedef unsigned long (*genpool_algo_t)(unsigned long *map,unsigned long size,unsigned long start,unsigned int nr,void *data, struct gen_pool *pool, unsigned long start_addr);  
> []
> > > diff --git a/scripts/kernel-doc b/scripts/kernel-doc  
> []
> > >      # Parse function prototypes
> > > -    if ($x =~ /typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
> > > +    if ($x =~ /typedef\s+(\w+\s*){1,}\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||  
> > 
> > I sure wish we could find a way to make all these regexes more
> > understandable and maintainable.  Reviewing a change like this is ... fun.  
> 
> Perhaps using some of the checkpatch regex definitions like:
> 
> $Type
> $Ident
> $balanced_parens
> 
> would help improve readability.

Well, this can help:


	my $typedef_type = qr { ((?:\w+\s+){1,}) }x;
	my $typedef_ident = qr { \*?\s*(\w\S+)\s* }x;
	my $typedef_args = qr { \s*\((.*)\); }x;

	my $typedef1 = qr { typedef\s+$typedef_type\($typedef_ident\)$typedef_args }x;
	my $typedef2 = qr { typedef\s+$typedef_type$typedef_ident$typedef_args }x;

	# Parse function typedef prototypes
	if ($x =~ $typedef1 || $x =~ $typedef2) {
		...

But, IMHO, this is as complicated as before, and makes harder
to test the regex outside kernel_doc (like using regex101).

A good thing is that it is easier to see the difference between
the two typedef regexes.

I'll place such optimization on a separate patch. This way, it
should be easier to decide later if this is worth or not.

Also, if we're willing to take such direction, it could make
sense to use the same regexes for matching type, identifier
and arguments inside the functions parser.

> And the regex above doesn't quite work for spacing after typedef.
> The regex should allow space between the open parenthesis and the *
> 
> 	typedef <Type> ( * <Ident> ) (args...);
> 
> And this regex does not find typedefs that use another typedef as <Ident> like:
> 
> arch/s390/include/asm/debug.h:typedef int (debug_header_proc_t) (debug_info_t *id,

True. I guess that, in order to properly handle it, we should use this:

    if ($x =~ /typedef\s+((?:\w+\s+){1,})\(\*?\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
	$x =~ /typedef\s+((?:\w+\s+){1,})\s*\*?(\w\S+)\s*\s*\((.*)\);/) {

The first check should now parse everything properly:

	https://regex101.com/r/bPTm18/5

And the second regex should also get multi-word types when parenthesis
is not used, like:

	typedef unsigned int debug_header_proc_t (debug_info_t *id, struct debug_view *view, int area,debug_entry_t *entry, char *out_buf);
	typedef unsigned int *debug_header_proc_t (debug_info_t *id, struct debug_view *view, int area,debug_entry_t *entry, char *out_buf);


https://regex101.com/r/Y56X1X/1

Thanks,
Mauro

[PATCH] scripts: kernel-doc: fix typedef parsing

The include/linux/genalloc.h file defined this typedef:

	typedef unsigned long (*genpool_algo_t)(unsigned long *map,unsigned long size,unsigned long start,unsigned int nr,void *data, struct gen_pool *pool, unsigned long start_addr);

Because it has a type composite of two words (unsigned long),
the parser gets the typedef name wrong:

.. c:macro:: long

   **Typedef**: Allocation callback function type definition

Fix the regex in order to accept composite types when
defining a typedef for a function pointer.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 99cd8418ff8a..54832618eea0 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1438,13 +1438,14 @@ sub dump_typedef($$) {
     $x =~ s@/\*.*?\*/@@gos;	# strip comments.
 
     # Parse function prototypes
-    if ($x =~ /typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
-	$x =~ /typedef\s+(\w+)\s*(\w\S+)\s*\s*\((.*)\);/) {
+    if ($x =~ /typedef\s+((?:\w+\s+){1,})\(\*?\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
+	$x =~ /typedef\s+((?:\w+\s+){1,})\s*\*?(\w\S+)\s*\s*\((.*)\);/) {
 
 	# Function typedefs
 	$return_type = $1;
 	$declaration_name = $2;
 	my $args = $3;
+	$return_type =~ s/\s+$//;
 
 	create_parameterlist($args, ',', $file, $declaration_name);
 




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ