[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0173e76a36b3a9b4e7f324dd3a36fd4a9757f302.camel@perches.com>
Date: Fri, 15 Dec 2023 10:30:52 -0800
From: Joe Perches <joe@...ches.com>
To: Alvin Šipraga <ALSI@...g-olufsen.dk>
Cc: Alvin Šipraga <alvin@...s.dk>, Linus Torvalds
<torvalds@...ux-foundation.org>, Duje Mihanović
<duje.mihanovic@...le.hr>, Konstantin Ryabitsev
<konstantin@...uxfoundation.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, Shawn Guo <shawnguo@...nel.org>, Andrew
Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2] get_maintainer: correctly parse UTF-8 encoded names
in files
On Fri, 2023-12-15 at 10:30 +0000, Alvin Šipraga wrote:
> On Thu, Dec 14, 2023 at 07:57:54AM -0800, Joe Perches wrote:
> > On Thu, 2023-12-14 at 16:06 +0100, Alvin Šipraga wrote:
> > > @@ -442,7 +443,7 @@ sub maintainers_in_file {
> > > my $text = do { local($/) ; <$f> };
> > > close($f);
> > >
> > > - my @poss_addr = $text =~ m$[A-Za-zÀ-ÿ\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> > > + my @poss_addr = $text =~ m$[\p{L}\"\' \,\.\+-]*\s*[\,]*\s*[\(\<\{]{0,1}[A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+\.[A-Za-z0-9]+[\)\>\}]{0,1}$g;
> > > push(@file_emails, clean_file_emails(@poss_addr));
Hi again Alvin.
Separate issue, but on the one .yaml file I tried:
$ ./scripts/get_maintainer.pl Documentation/devicetree/bindings/serial/8250.yaml
Greg Kroah-Hartman <gregkh@...uxfoundation.org> (supporter:TTY LAYER AND SERIAL DRIVERS)
Jiri Slaby <jirislaby@...nel.org> (supporter:TTY LAYER AND SERIAL DRIVERS)
Rob Herring <robh+dt@...nel.org> (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org> (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
Conor Dooley <conor+dt@...nel.org> (maintainer:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
Lubomir Rintel <lkundrak@...sk> (in file)
- <devicetree@...r.kernel.org> (in file)
linux-kernel@...r.kernel.org (open list:TTY LAYER AND SERIAL DRIVERS)
linux-serial@...r.kernel.org (open list:TTY LAYER AND SERIAL DRIVERS)
devicetree@...r.kernel.org (open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS)
Note the single '-' in the "name" portion of devicetree@...r.kernel.org
Maybe clean_file_emails needs some better name cleansing code.
> > Rather than open _all_ files in utf-8, perhaps the block
> > that opens a specific file to find maintainers
> >
> > sub maintainers_in_file {
> > my ($file) = @_;
> >
> > return if ($file =~ m@\bMAINTAINERS$@);
> >
> > if (-f $file && ($email_file_emails || $file =~ /\.yaml$/)) {
> > open(my $f, '<', $file)
> > or die "$P: Can't open $file: $!\n";
> > my $text = do { local($/) ; <$f> };
> > close($f);
> > ...
> >
> > should change the
> >
> > open(my $f...
> > to
> > use open qw(:std :encoding(UTF-8));
> > open(my $f...
>
> Yes, this also works for parsing the name in an arbitrary file. But with the
> change you suggest above, the script then corrupts my name when it is lifted
> from MAINTAINERS (!?):
>
> $ ./scripts/get_maintainer.pl -f drivers/net/dsa/realtek/ | grep alsi
> "Alvin Å ipraga" <alsi@...g-olufsen.dk> (maintainer:REALTEK RTL83xx SMI DSA ROUTER CHIPS)
Curious. Let me see if I can figure out why that happens.
> If you are still unconvinced then I will gladly send a v3 patching the two cases
> we have discussed (read_maintainer_file() and maintainers_in_file()).
No rush.
> > And unrelated and secondarily, perhaps the
> > $file =~ /\.yaml$/
> > test should be
> > $file =~ /\.(?:yaml|dtsi?)$/
> >
> > to also find any maintainer address in the dts* files
> >
> > https://lore.kernel.org/lkml/20231028174656.GA3310672@bill-the-cat/T/
>
> Is this supposed to parse the "Copyright (c) 20xx John Doe <foo@....toto>" in
> the .dts* files?
Yes, just as it would and does for .yaml files.
$ git grep -P -i 'copy.*\<\w+\@\w+\.\w+\>' -- '*.yaml'
Documentation/devicetree/bindings/display/bridge/chrontel,ch7033.yaml:# Copyright (C) 2019,2020 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/media/marvell,mmp2-ccic.yaml:# Copyright 2019,2020 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/misc/olpc,xo1.75-ec.yaml:# Copyright (C) 2019,2020 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/phy/allwinner,sun50i-h6-usb3-phy.yaml:# Copyright 2019 Ondrej Jirman <megous@...ous.com>
Documentation/devicetree/bindings/phy/marvell,mmp3-hsic-phy.yaml:# Copyright 2019 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/phy/marvell,mmp3-usb-phy.yaml:# Copyright 2019,2020 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/reset/bitmain,bm1880-reset.yaml:# Copyright 2019 Manivannan Sadhasivam <mani@...nel.org>
Documentation/devicetree/bindings/reset/marvell,berlin2-reset.yaml:# Copyright 2015 Antoine Tenart <atenart@...nel.org>
Documentation/devicetree/bindings/reset/qca,ar7100-reset.yaml:# Copyright 2015 Alban Bedel <albeu@...e.fr>
Documentation/devicetree/bindings/serial/8250.yaml:# Copyright 2020 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/spi/marvell,mmp2-ssp.yaml:# Copyright 2019,2020 Lubomir Rintel <lkundrak@...sk>
Documentation/devicetree/bindings/usb/marvell,pxau2o-ehci.yaml:# Copyright 2019,2020 Lubomir Rintel <lkundrak@...sk>
> But sure, I can do a resend of Shawn's original patch
> separately if you like.
Yes please. Make sure to cc Andrew Morton.
Powered by blists - more mailing lists