lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20231219-get-maintainers-utf8-v3-2-f85a39e2265a@bang-olufsen.dk>
Date: Tue, 19 Dec 2023 02:25:15 +0100
From: Alvin Šipraga <alvin@...s.dk>
To: Joe Perches <joe@...ches.com>, 
 Linus Torvalds <torvalds@...ux-foundation.org>, 
 Andrew Morton <akpm@...ux-foundation.org>
Cc: Duje Mihanović <duje.mihanovic@...le.hr>, 
 Konstantin Ryabitsev <konstantin@...uxfoundation.org>, 
 linux-kernel@...r.kernel.org, 
 Alvin Šipraga <alsi@...g-olufsen.dk>
Subject: [PATCH v3 2/2] get_maintainer: remove stray punctuation when
 cleaning file emails

From: Alvin Šipraga <alsi@...g-olufsen.dk>

When parsing emails from .yaml files in particular, stray punctuation
such as a leading '-' can end up in the name. For example, consider a
common YAML section such as:

  maintainers:
    - devicetree@...r.kernel.org

This would previously be processed by get_maintainer.pl as:

  - <devicetree@...r.kernel.org>

Make the logic in clean_file_emails more robust by deleting any
sub-names which consist of common single punctuation marks before
proceeding to the best-effort name extraction logic. The output is then
correct:

  devicetree@...r.kernel.org

Some additional comments are added to the function to make things
clearer to future readers.

Link: https://lore.kernel.org/all/0173e76a36b3a9b4e7f324dd3a36fd4a9757f302.camel@perches.com/
Suggested-by: Joe Perches <joe@...ches.com>
Signed-off-by: Alvin Šipraga <alsi@...g-olufsen.dk>
---
 scripts/get_maintainer.pl | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index dac38c6e3b1c..ee1aed7e090c 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -2462,11 +2462,17 @@ sub clean_file_emails {
     foreach my $email (@file_emails) {
 	$email =~ s/[\(\<\{]{0,1}([A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+)[\)\>\}]{0,1}/\<$1\>/g;
 	my ($name, $address) = parse_email($email);
-	if ($name eq '"[,\.]"') {
-	    $name = "";
-	}
 
+	# Strip quotes for easier processing, format_email will add them back
+	$name =~ s/^"(.*)"$/$1/;
+
+	# Split into name-like parts and remove stray punctuation particles
 	my @nw = split(/[^\p{L}\'\,\.\+-]/, $name);
+	@nw = grep(!/^[\'\,\.\+-]$/, @nw);
+
+	# Make a best effort to extract the name, and only the name, by taking
+	# only the last two names, or in the case of obvious initials, the last
+	# three names.
 	if (@nw > 2) {
 	    my $first = $nw[@nw - 3];
 	    my $middle = $nw[@nw - 2];
@@ -2480,18 +2486,16 @@ sub clean_file_emails {
 	    } else {
 		$name = "$middle $last";
 	    }
+	} else {
+	    $name = "@nw";
 	}
 
 	if (substr($name, -1) =~ /[,\.]/) {
 	    $name = substr($name, 0, length($name) - 1);
-	} elsif (substr($name, -2) =~ /[,\.]"/) {
-	    $name = substr($name, 0, length($name) - 2) . '"';
 	}
 
 	if (substr($name, 0, 1) =~ /[,\.]/) {
 	    $name = substr($name, 1, length($name) - 1);
-	} elsif (substr($name, 0, 2) =~ /"[,\.]/) {
-	    $name = '"' . substr($name, 2, length($name) - 2);
 	}
 
 	my $fmt_email = format_email($name, $address, $email_usename);

-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ