Message-ID: <alpine.DEB.2.21.2104200331110.44318@angie.orcam.me.uk>
Date:   Tue, 20 Apr 2021 04:50:48 +0200 (CEST)
From:   "Maciej W. Rozycki" <macro@...am.me.uk>
To:     Arnd Bergmann <arnd@...db.de>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>
cc:     Huacai Chen <chenhuacai@...nel.org>,
        Huacai Chen <chenhuacai@...ngson.cn>,
        Jiaxun Yang <jiaxun.yang@...goat.com>,
        linux-arch@...r.kernel.org, linux-mips@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [PATCH 4/4] MIPS: Avoid DIVU in `__div64_32' if result would be
 zero

We already check the high part of the dividend against zero to avoid the 
costly DIVU instruction in that case, needed to reduce the high part of 
the dividend, so we may well check against the divisor instead and set 
the high part of the quotient to zero right away.  We need to treat the 
high part of the dividend in that case though as the remainder that would 
be calculated by the DIVU instruction we avoided.

This has passed correctness verification with test_div64 and reduced the
module's average execution time down to 1.0445s and 0.2619s from 1.0668s
and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.

Signed-off-by: Maciej W. Rozycki <macro@...am.me.uk>
---
I have made an experimental change on top of this to put `__div64_32' out 
of line, and that increases the averages respectively up to 1.0785s and 
0.2705s.  Not a terrible loss, especially compared to the generic times 
quoted with 3/4, but still a loss, so I think it would best be done only 
when optimising for size, as noted in the cover letter.
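
For illustration only, the shortcut described in the commit message can 
be modelled in portable C.  This is a sketch under stated assumptions, 
not the kernel macro: the function name `div64_32_model' is hypothetical, 
plain C division stands in for both the DIVU instruction and the 
`do_div64_32' step, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/*
 * Hypothetical portable model of the patched logic.  Divides *n by base
 * in place and returns the remainder.  base must be nonzero.
 */
static uint32_t div64_32_model(uint64_t *n, uint32_t base)
{
	uint32_t high = (uint32_t)(*n >> 32);
	uint32_t low = (uint32_t)*n;
	uint32_t upper, rem;
	uint64_t rest;

	if (high < base) {
		/*
		 * high / base == 0 and high % base == high, so no
		 * division is needed: the high part of the dividend
		 * becomes the remainder carried into the low step.
		 */
		upper = high;
		high = 0;
	} else {
		/* Stands in for DIVU: quotient and remainder of high. */
		upper = high % base;
		high = high / base;
	}

	/*
	 * Stands in for the low-part step: divide (upper:low) by base.
	 * Since upper < base, the quotient fits in 32 bits.
	 */
	rest = ((uint64_t)upper << 32) | low;
	low = (uint32_t)(rest / base);
	rem = (uint32_t)(rest % base);

	*n = ((uint64_t)high << 32) | low;
	return rem;
}
```

With *n = (3 << 32) | 5 and base = 10 the first branch is taken, and the 
model leaves 1288490189 in *n and returns 3, matching native 64-bit 
division.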
---
 arch/mips/include/asm/div64.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
===================================================================
--- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
+++ linux-3maxp-div64/arch/mips/include/asm/div64.h
@@ -68,9 +68,11 @@
 									\
 	__high = __div >> 32;						\
 	__low = __div;							\
-	__upper = __high;						\
 									\
-	if (__high) {							\
+	if (__high < __radix) {						\
+		__upper = __high;					\
+		__high = 0;						\
+	} else {							\
 		__asm__("divu	$0, %z1, %z2"				\
 		: "=x" (__modquot)					\
 		: "Jr" (__high), "Jr" (__radix));			\
