Message-ID: <20250210170354.18c04f7c@sal.lan>
Date: Mon, 10 Feb 2025 17:03:54 +0100
From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To: Jonathan Corbet <corbet@....net>
Cc: Linux Doc Mailing List <linux-doc@...r.kernel.org>, Greg Kroah-Hartman
<gregkh@...uxfoundation.org>, linux-kernel@...r.kernel.org
Subject: Re: [RFC v2 18/38] docs: sphinx/kernel_abi: use AbiParser directly
On Mon, 10 Feb 2025 07:40:02 -0700
Jonathan Corbet <corbet@....net> wrote:
> Mauro Carvalho Chehab <mchehab+huawei@...nel.org> writes:
>
> > I took a look on Markus work: it was licensed under GPL 3.0 and it was
> > written in 2016. There were several changes on kerneldoc since them,
> > including the addition of a regex that it is not compatible with
> > Python re[1]:
> >
> > $members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
> >
> > This one use:
> >
> > - recursive patterns: ?1
> > - atomic grouping (?>...)
> >
> > Also, it is hard to map what he does with the existing script. I'm
> > opting to write a new script from scratch.
>
> That's fine, I just wanted to be sure you'd had a chance to look at
> it...
>
> > Another option would be to re-implement such regexes without using
> > such advanced patterns.
>
> Seems like a preferred option if that can be done. Banging one's head
> against all those regexes is often the hardest part of dealing with that
> script; anything that makes it simpler is welcome.
Agreed. This one, in particular, is very hard for me to understand, as I
have never used recursive patterns or atomic grouping. The net result of
the struct_group*() handling is that it removes some parameters when
generating the function prototype. This is done with complex logic
in two steps:
# unwrap struct_group():
# - first eat non-declaration parameters and rewrite for final match
# - then remove macro, outer parens, and trailing semicolon
$members =~ s/\bstruct_group\s*\(([^,]*,)/STRUCT_GROUP(/gos;
$members =~ s/\bstruct_group_attr\s*\(([^,]*,){2}/STRUCT_GROUP(/gos;
$members =~ s/\bstruct_group_tagged\s*\(([^,]*),([^,]*),/struct $1 $2; STRUCT_GROUP(/gos;
$members =~ s/\b__struct_group\s*\(([^,]*,){3}/STRUCT_GROUP(/gos;
$members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
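For comparison, the first four substitutions use nothing beyond what
Python's re module supports, so they appear to port directly. A minimal
sketch, assuming a hypothetical helper name (rewrite_struct_groups) and a
made-up input string; only the fifth, recursive substitution needs
different handling:

```python
import re

# Hypothetical port of the first four Perl substitutions. Perl's /g maps
# to re.sub() replacing every match by default; /s is not needed here
# because none of these patterns contain ".".
STEP1 = [
    (r'\bstruct_group\s*\(([^,]*,)', r'STRUCT_GROUP('),
    (r'\bstruct_group_attr\s*\(([^,]*,){2}', r'STRUCT_GROUP('),
    (r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),',
     r'struct \1 \2; STRUCT_GROUP('),
    (r'\b__struct_group\s*\(([^,]*,){3}', r'STRUCT_GROUP('),
]

def rewrite_struct_groups(members):
    """Drop the non-declaration macro arguments, leaving STRUCT_GROUP(...)."""
    for pattern, repl in STEP1:
        members = re.sub(pattern, repl, members)
    return members

# Illustrative input, not taken from the kernel tree.
sample = "int a; struct_group(hdr, int b; int c;); int d;"
print(rewrite_struct_groups(sample))
```

The same order as the Perl code is kept, so each variant is normalized to
STRUCT_GROUP( before the final unwrapping step runs.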
The first step basically eliminates some members of the function. In the
places I checked, the second step was just removing the parentheses from
the macro (and the STRUCT_GROUP name).
I suspect that the same result could be achieved with a much simpler
expression like:
$members =~ s/\bSTRUCT_GROUP\((.*)\)[^;]*;/$1/gos;
But maybe there are some corner cases that would cause such a simpler
regex to fail.
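One such corner case seems easy to reproduce: with a greedy ".*" and the
"s" flag, two STRUCT_GROUP() blocks in the same member list are merged
into a single match. The sketch below is Python, with hypothetical
function names and made-up inputs. It shows that behaviour next to a
small hand-written scanner that finds the matching ")" by counting depth,
which avoids recursive patterns and atomic groups altogether:

```python
import re

def unwrap_greedy(members):
    # Python rendering of the simpler regex; its only capture group is
    # the macro body.
    return re.sub(r'\bSTRUCT_GROUP\((.*)\)[^;]*;', r'\1', members,
                  flags=re.S)

START = re.compile(r'\bSTRUCT_GROUP\(')

def unwrap_balanced(members):
    # Hypothetical stand-in for the recursive Perl regex: locate each
    # "STRUCT_GROUP(", walk to its matching ")" by tracking nesting
    # depth, keep the body, and drop the text up to the next ";".
    out = []
    pos = 0
    while True:
        m = START.search(members, pos)
        if not m:
            break
        depth = 1
        i = m.end()
        while i < len(members) and depth:
            if members[i] == '(':
                depth += 1
            elif members[i] == ')':
                depth -= 1
            i += 1
        end = members.find(';', i)
        if depth or end < 0:
            # Unbalanced or no trailing ";": keep the text and move on.
            out.append(members[pos:m.end()])
            pos = m.end()
            continue
        out.append(members[pos:m.start()])
        out.append(members[m.end():i - 1])
        pos = end + 1
    out.append(members[pos:])
    return ''.join(out)

# Illustrative input with two groups in one member list.
two = "int a; STRUCT_GROUP( int b; int c;); STRUCT_GROUP( int d;); int e;"
print(unwrap_greedy(two))
print(unwrap_balanced(two))
```

With a single group per struct both functions give the same result, which
may be why the simpler regex looked sufficient in the places checked.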
-
On a side note, the "o" flag used there in kernel-doc is described
as[1]:
"o - pretend to optimize your code, but actually introduce bugs"
I wonder if we're hitting any issues in the kernel docs due to that ;-)
[1] https://perldoc.perl.org/perlre::#Other-Modifiers
>
> Thanks,
>
> jon