linux-kernel - Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Wed, 22 Mar 2017 23:01:33 -0700
From:   Darren Hart <dvhart@...radead.org>
To:     Joe Perches <joe@...ches.com>
Cc:     "John 'Warthog9' Hawley (VMware)" <warthog9@...lescrag.net>,
        linux-kernel@...r.kernel.org, Andy Whitcroft <apw@...onical.com>
Subject: Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent
 spurious warnings

On Wed, Mar 22, 2017 at 11:17:33AM -0700, Joe Perches wrote:
> On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote:
> > On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote:
> > > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote:
> > > > Spamassassin sticks a long (~79 character) long string after a
> > > > line that has a single space in it. The line with space causes
> > > > checkpatch to erroniously think that it's in the content body, as
> > > > opposed to headers and thus flag a mail header as an unwrapped long
> > > > comment line.
> > > 
> > > If the spammassassin header is like
> > > 
> > > email-header-n: foo
> > > email-header-m: bar
> > >  
> > > X-Spam-Report: bar
> > 
> > The specific content of the X-Spam-Report that triggers this for me,
> > from this patch for example, is:
> > 
> > === 8< ===
> > X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary:
> >  Content analysis details:   (-1.9 points, 5.0 required)
> >  
> >   pts rule name              description
> >  ---- ---------------------- --------------------------------------------------
> >  -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
> >                              [score: 0.0000]
> > X-TUID: alGBIuPZmqOj
> > 
> > === >8 ===
> > 
> > The long ---- ----... line is over 75 characters and triggers the test
> > for long commit_log lines.
> > 
> > > 
> > > Does that form follow rfc 5322?
> > 
> > By my reading, this is governed by the long header fields defined by
> > 2.2.3, with whitespace folding defined as "a CRLF may be inserted before
> > any WSP."
> > 
> > > 
> > > If it does then any email header could have that
> > > form and the header wrapping test should be
> > 
> > Yes, agreed.
> > 
> > So the logic we want is:
> > 
> > If we are in headers and we detect a CRLF and the next line starts with a WSP,
> > then we are still in headers (and therefor not in the commit log). The CRLF
> > information does not appear to be available as it is replaced with just \n.
> > 
> > > updated from
> > > 
> > > 		if ($in_header_lines && $realfile =~ /^$/ &&
> > > 		    !($rawline =~ /^\s+\S/ ||
> > > 		      $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
> > > 			$in_header_lines = 0;
> > > 			$in_commit_log = 1;
> > > 			$has_commit_log = 1;
> > > 		}
> > > 
> > > to something like
> > > 
> > > 		if ($in_header_lines && $realfile =~ /^$/ &&
> > > 		    !($rawline =~ /^ (?:\s*\S|$)/ ||
> > 
> > Hrm... lines that start with maybe a space followed by a : ... Why did you
> > introduce that part of the check?
> 
> The regex doesn't care about colons.
> It's a perl non-capturing group.
> https://perldoc.perl.org/perlretut.html#Non-capturing-groupings

Aha, thank you for the pointer.

> 
> > Looking at this more closely, I was also not clear why the original test looked
> > for several spaces followed by non-space. What case is this for?
> 
> Not several spaces, one or more spaces then a non-space.
> The only change here is allowing an initial space followed
> by either:
> 
> 	1: optional spaces, then non-space.
> 	2: EOL
> 
> I supposed you could argue that case 2 should
> also allow optional spaces before EOL and the
> test should be
> 
> 		if ($in_header_lines && $realfile =~ /^$/ &&
> 		    !($rawline =~ /^\s+(?:\S|$)/ ||

I still haven't figured out why we test for this specific set of patterns. Why
is a line that starts with a space and ends with a newline considered still
in_header_lines. Or more specifically, why aren't we just testing for an empty
line (RFC 5322 Section 2.1, defining the separation of headers and the body).

-- 
Darren Hart
VMware Open Source Technology Center