lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211125141817.3541501-1-eric.dumazet@gmail.com>
Date:   Thu, 25 Nov 2021 06:18:17 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Noah Goldstein <goldstein.w.n@...il.com>,
        Alexander Duyck <alexanderduyck@...com>
Subject: [PATCH] x86/csum: fix initial seed for odd buffers

From: Eric Dumazet <edumazet@...gle.com>

When I folded do_csum() into csum_partial(), I missed that we
had to swap odd/even bytes from @sum argument.

This is because this swap will happen again at the end of the function.

[A, B, C, D] -> [B, A, D, C]

As far as Internet checksums (rfc 1071) are concerned, we can instead
rotate the whole 32bit value by 8 (or 24)

-> [D, A, B, C]

Note that I played with the idea of replacing this final swaping:

    result = from32to16(result);
    result = ((result >> 8) & 0xff) | ((result & 0xff) << 8);

With:

    result = ror32(result, 8);

But while the generated code was definitely better for the odd case,
run time cost for the more likely even case was not better for gcc.

gcc is replacing a well predicted conditional branch
with a cmov instruction after a ror instruction which adds
a cost canceling the cmov gain.

Many thanks to Noah Goldstein for reporting this issue.

Fixes: df4554cebdaa ("x86/csum: Rewrite/optimize csum_partial()")
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Reported-by: Noah Goldstein <goldstein.w.n@...il.com>
Cc: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: Alexander Duyck <alexanderduyck@...com>
---
 arch/x86/lib/csum-partial_64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/lib/csum-partial_64.c b/arch/x86/lib/csum-partial_64.c
index 1eb8f2d11f7c785be624eba315fe9ca7989fd56d..40b527ba1da1f74b5dbc51ddd97a9ecad22ee5ae 100644
--- a/arch/x86/lib/csum-partial_64.c
+++ b/arch/x86/lib/csum-partial_64.c
@@ -41,6 +41,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum)
 	if (unlikely(odd)) {
 		if (unlikely(len == 0))
 			return sum;
+		temp64 = ror32((__force u32)sum, 8);
 		temp64 += (*(unsigned char *)buff << 8);
 		len--;
 		buff++;
-- 
2.34.0.rc2.393.gf8c9666880-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ