Message-ID: <d52110471b332b047777616c762b086ee662225e.camel@perches.com>
Date: Sat, 17 Sep 2022 07:11:52 -0700
From: Joe Perches <joe@...ches.com>
To: Janne Grunau <j@...nau.net>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH] get_maintainer: Extend matched name characters in
maintainers_in_file()
On Fri, 2022-09-16 at 10:47 +0200, Janne Grunau wrote:
> Extend the regexp matching name characters to cover Unicode blocks Latin
> Extended-A and Extended-B.
> Fixes 'scripts/get_maintainer.pl -f' for
> 'Documentation/devicetree/bindings/clock/apple,nco.yaml'.
>
> Signed-off-by: Janne Grunau <j@...nau.net>
>
> ---
> This still excludes Greek and Cyrillic characters, which should be
> expected in names as well. I tried to use '\p{L}' to match all Unicode
> letters but couldn't get it to work. Feel free to treat this as a bug
> report with an incomplete fix.
Maybe use \p{XPosixAlpha}?
I don't know which version of perl introduced it, though.
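
A minimal sketch of one thing worth checking when a Unicode property class seems not to work. It assumes (the thread does not confirm this) that the file contents reach the regexp as raw UTF-8 bytes rather than decoded characters. The sample name and the helper is_all_alpha() are hypothetical, not taken from get_maintainer.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample name containing U+0161 (Latin Extended-A),
# written as the raw UTF-8 bytes a read without an encoding layer yields.
my $bytes = "Pov\xc5\xa1er";

# The same name decoded into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# True if every element of the string is matched by \p{XPosixAlpha}.
sub is_all_alpha {
    my ($s) = @_;
    return $s =~ /^\p{XPosixAlpha}+$/ ? 1 : 0;
}

binmode(STDOUT, ':encoding(UTF-8)');
printf "bytes: length=%d all_alpha=%d\n", length($bytes), is_all_alpha($bytes);
printf "chars: length=%d all_alpha=%d\n", length($chars), is_all_alpha($chars);
```

In the undecoded string each byte is treated as its own code point, so the second byte of the two-byte sequence (0xA1) is not a letter and the name does not match as a whole. If that is what happens in the script, decoding the input first (for example with an :encoding(UTF-8) layer on the file handle) would be one way to try \p{L} or \p{XPosixAlpha} again.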
> diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
[]
> @@ -442,7 +442,7 @@ sub maintainers_in_file {
> my $text = do { local($/) ; <$f> };
> close($f);
>
> - my @poss_addr = $text =~ m$[A-Za-zÀ-ÿ\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> + my @poss_addr = $text =~ m$[A-Za-zÀ-ɏ\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
my @poss_addr = $text =~ m$[\p{XPosixAlpha}\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
?
> push(@file_emails, clean_file_emails(@poss_addr));
> }
> }
> @@ -2460,7 +2460,7 @@ sub clean_file_emails {
> $name = "";
> }
>
> - my @nw = split(/[^A-Za-zÀ-ÿ\'\,\.\+-]/, $name);
> + my @nw = split(/[^A-Za-zÀ-ɏ\'\,\.\+-]/, $name);
Maybe here too
> + my @nw = split(/[^\p{XPosixAlpha}\'\,\.\+-]/, $name);
Dunno, I haven't tested this. Maybe you care to test?
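
A quick standalone check of the split() change could look like the sketch below. It is not part of the patch: the sample names and the helper split_name() are made up for illustration, the strings are already decoded character strings, and the three patterns are copies of the character classes quoted above with the literal ranges written as escapes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The three candidate delimiter classes from the thread.
my $re_old   = qr/[^A-Za-z\x{C0}-\x{FF}\'\,\.\+-]/;    # before the patch
my $re_patch = qr/[^A-Za-z\x{C0}-\x{24F}\'\,\.\+-]/;   # Latin Extended-A/B
my $re_prop  = qr/[^\p{XPosixAlpha}\'\,\.\+-]/;        # suggested property

# Split a name into words using the given delimiter class.
sub split_name {
    my ($re, $name) = @_;
    return split($re, $name);
}

# Hypothetical test names: one with U+0161, one in Cyrillic.
my $latin    = "Martin Povi\x{161}er";
my $cyrillic = "\x{418}\x{432}\x{430}\x{43d}";

binmode(STDOUT, ':encoding(UTF-8)');
for my $case (['old', $re_old], ['patch', $re_patch], ['prop', $re_prop]) {
    my ($label, $re) = @$case;
    my @l = split_name($re, $latin);
    my @c = split_name($re, $cyrillic);
    printf "%-5s latin=%d words, cyrillic=%d words\n",
        $label, scalar(@l), scalar(@c);
}
```

The word counts show whether a class cuts a name apart at a letter it does not know; the property-based class is the only one of the three that keeps the Cyrillic sample in one piece.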