Message-Id: <20180718144222.fc567d972447348b419edcd9@linux-foundation.org>
Date: Wed, 18 Jul 2018 14:42:22 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Geert Uytterhoeven <geert+renesas@...der.be>
Cc: Andy Whitcroft <apw@...onical.com>, Joe Perches <joe@...ches.com>,
Stephen Rothwell <sfr@...b.auug.org.au>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] checkpatch: Only encode UTF-8 quoted printable mail headers
On Wed, 18 Jul 2018 16:52:54 +0200 Geert Uytterhoeven <geert+renesas@...der.be> wrote:
> As PERL uses its own internal character encoding, always calling
> encode("utf8", ...) on the author name may cause corruption, leading to
> an author signoff mismatch.
>
> This happens in the following cases:
> - If a patch is in ISO-8859, and contains a non-ASCII author name in
> the From: line, it is converted to UTF-8, while the Signed-off-by
> line will still be in ISO-8859.
> - If a patch is in UTF-8, and contains a non-ASCII author name in the
> body (not header) From: line, it is assumed to be encoded in PERL's
> internal character encoding, and converted to UTF-8 incorrectly,
> while the Signed-off-by line will be in real UTF-8.
>
> Fix this by only doing the encode step if the From: line used UTF-8
> quoted printable encoding.
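
For illustration, the condition described above (a From: line carrying a UTF-8 quoted-printable encoded word, in the RFC 2047 form =?UTF-8?Q?...?=) could be detected roughly as follows. This is a shell sketch under stated assumptions, not the Perl code from the patch; the helper name from_is_utf8_qp is hypothetical.

```shell
# Hypothetical helper: succeeds if the header line passed as $1 is a
# From: line containing a UTF-8 quoted-printable encoded word such as
# =?UTF-8?q?J=C3=BCrgen?= (charset and encoding are case-insensitive).
from_is_utf8_qp() {
	printf '%s\n' "$1" | grep -q -i -E '^From:.*=\?utf-8\?q\?[^?]*\?='
}
```

Only when a check like this succeeds would the encode step apply; a plain From: line, or one encoded in another charset, would be left alone.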
Works for me, thanks.

Relatedly, would it be worth adding a checkpatch warning if a patch
contains anything other than ASCII or UTF-8?

I added this to my little local patch-checking script:

	if ! file "$p" | grep -q -E "ASCII text|Unicode text"
	then
		echo "$p: weird charset"
	fi
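
A variant that does not depend on the wording of file(1) output might validate the bytes directly. The sketch below assumes an iconv(1) that exits non-zero on an invalid input sequence; since ASCII is a subset of UTF-8, a file that converts cleanly from UTF-8 to UTF-8 is one or the other. The helper name is_ascii_or_utf8 is hypothetical.

```shell
# Hypothetical helper: succeeds if the file named by $1 is valid UTF-8
# (which includes pure ASCII). iconv is used only as a validator, so
# its converted output and error messages are discarded.
is_ascii_or_utf8() {
	iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Report every patch given on the command line that fails the check.
for p in "$@"
do
	if ! is_ascii_or_utf8 "$p"
	then
		echo "$p: weird charset"
	fi
done
```

Unlike the file(1) approach, this reads the whole patch rather than a sample, at the cost of not saying which other charset was found.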