Message-ID: <20220918203217.GG4024@jannau.net>
Date:   Sun, 18 Sep 2022 22:32:17 +0200
From:   Janne Grunau <j@...nau.net>
To:     Joe Perches <joe@...ches.com>
Cc:     linux-kernel@...r.kernel.org,
        Martin Povišer <povik+lin@...ebit.org>
Subject: Re: [PATCH] get_maintainer: Extend matched name characters in
 maintainers_in_file()

On 2022-09-18 10:03:17 -0700, Joe Perches wrote:
> On Sat, 2022-09-17 at 07:11 -0700, Joe Perches wrote:
> > On Fri, 2022-09-16 at 10:47 +0200, Janne Grunau wrote:
> > > Extend the regexp matching name characters to cover Unicode blocks Latin
> > > Extended-A and Extended-B.
> > > Fixes 'scripts/get_maintainer.pl -f' for
> > > 'Documentation/devicetree/bindings/clock/apple,nco.yaml'.
> > > 
> > > Signed-off-by: Janne Grunau <j@...nau.net>
> > > 
> > > ---
> > > This still excludes Greek and Cyrillic characters, which should be
> > > expected in names as well. I tried to use '\p{L}' to match all Unicode
> > > letters but couldn't get it to work. Feel free to treat this as a bug
> > > report with an incomplete fix.
> > 
> > Maybe use \p{XPosixAlpha} ?
> > 
> > but I don't know what version of perl introduced this.
> > 
> > > diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> > []
> > > @@ -442,7 +442,7 @@ sub maintainers_in_file {
> > >  	my $text = do { local($/) ; <$f> };
> > >  	close($f);
> > >  
> > > -	my @poss_addr = $text =~ m$[A-Za-zÀ-ÿ\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> > > +	my @poss_addr = $text =~ m$[A-Za-zÀ-ɏ\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> > 
> > 	my @poss_addr = $text =~ m$[\p{XPosixAlpha}\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> 
> Using variations of \p{posix} doesn't seem to work for at least perl 5.34.
> 
> \p{print} seems to work for Documentation/devicetree/bindings/clock/apple,nco.yaml,
> but I don't know how fragile it is.
> 
> \p{print} might be too greedy...

It is. It produces the following diff (checking all files in
Documentation/devicetree/bindings):
-Lubomir Rintel <lkundrak@...sk> (in file)
+"Copyright 2019,2020 Lubomir Rintel" <lkundrak@...sk> (in file)

There are multiple hits of this form. The main issue is that \p{print}
includes space. It does, however, fix many names with 3 parts.

It still fails for "Rafał Miłecki <rafal@...ecki.pl>" which my change 
handles correctly.
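
The over-matching in the diff above can be reproduced in isolation. The
following is only a minimal sketch, not the script's code: first_match() is
a hypothetical helper, example.org is a placeholder address, the range À-ɏ
is written as code points (\x{C0}-\x{24F}), and the input line is kept
ASCII-only so that file encoding does not affect the result.

```perl
use strict;
use warnings;

# Address part of the pattern, copied from maintainers_in_file().
my $tail = '\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}';

# Hypothetical helper: first candidate found in $line when the leading
# name class uses $class for its letters.
sub first_match {
    my ($class, $line) = @_;
    my ($hit) = $line =~ m/([$class\"\' \,\.\+-]*$tail)/;
    return $hit;
}

# Assumed input, modelled on the diff above.
my $line = '# Copyright 2019,2020 Lubomir Rintel <lkundrak@example.org>';

my $range = first_match('A-Za-z\x{C0}-\x{24F}', $line);  # letters only
my $print = first_match('\p{print}', $line);             # any printable character

print "range: [$range]\n";
print "print: [$print]\n";
```

Under these assumptions the range class stops at the digits and returns
only the name and address (with a leading space), while \p{print} also
swallows the comment marker, "Copyright" and the years.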

I'm testing with perl 5.36

> ---
>  scripts/get_maintainer.pl | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
> index ab123b498fd9..790112c3e1d7 100755
> --- a/scripts/get_maintainer.pl
> +++ b/scripts/get_maintainer.pl
> @@ -442,7 +442,7 @@ sub maintainers_in_file {
>  	my $text = do { local($/) ; <$f> };
>  	close($f);
>  
> -	my @poss_addr = $text =~ m$[A-Za-zÀ-ÿ\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> +	my @poss_addr = $text =~ m$[\p{print}\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
>  	push(@file_emails, clean_file_emails(@poss_addr));
>      }
>  }
> @@ -2456,11 +2456,12 @@ sub clean_file_emails {
>      foreach my $email (@file_emails) {
>  	$email =~ s/[\(\<\{]{0,1}([A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+)[\)\>\}]{0,1}/\<$1\>/g;
>  	my ($name, $address) = parse_email($email);
> +	$name =~ s/^\p{space}*\p{punct}*\p{space}*//;

This change is useful independently of the name regexp, as it rejects
'- <email@...r.ess>' (YAML list items) as a valid name/email combination.
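
A minimal sketch of what the added substitution does to a name. The sample
strings and the helper strip_leading_punct() are hypothetical; what
parse_email() actually returns for a YAML list item is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution from the patch: drop
# leading whitespace, then leading punctuation, then whitespace again.
sub strip_leading_punct {
    my ($name) = @_;
    $name =~ s/^\p{space}*\p{punct}*\p{space}*//;
    return $name;
}

# Assumed inputs: a bare YAML list marker, a marker before a name,
# and an ordinary name.
for my $name ('- ', ' - Jane Doe', 'Jane Doe') {
    printf "'%s' -> '%s'\n", $name, strip_leading_punct($name);
}
```

A bare list marker is reduced to an empty name, and an ordinary name is
left unchanged.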

Janne
