Message-Id: <284CBC37-0F4F-4077-A172-7E47C90B8B43@goldelico.com>
Date:   Wed, 21 Apr 2021 18:05:37 +0200
From:   "H. Nikolaus Schaller" <hns@...delico.com>
To:     "Maciej W. Rozycki" <macro@...am.me.uk>
Cc:     Arnd Bergmann <arnd@...db.de>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        Huacai Chen <chenhuacai@...nel.org>,
        Huacai Chen <chenhuacai@...ngson.cn>,
        Jiaxun Yang <jiaxun.yang@...goat.com>,
        linux-arch@...r.kernel.org, linux-mips@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] MIPS: Avoid DIVU in `__div64_32' is result would be zero


> On 20.04.2021, at 04:50, Maciej W. Rozycki <macro@...am.me.uk> wrote:
> 
> We already check the high part of the divident against zero to avoid the 

nit-picking: s/divident/dividend/

(seems to come from Latin "dividendum" = the number that is to be divided).

> costly DIVU instruction in that case, needed to reduce the high part of 
> the divident, so we may well check against the divisor instead and set 
> the high part of the quotient to zero right away.  We need to treat the 
> high part of the divident in that case though as the remainder that would 
> be calculated by the DIVU instruction we avoided.
> 
> This has passed correctness verification with test_div64 and reduced the
> module's average execution time down to 1.0445s and 0.2619s from 1.0668s
> and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.

Impressive.

> 
> Signed-off-by: Maciej W. Rozycki <macro@...am.me.uk>
> ---
> I have made an experimental change on top of this to put `__div64_32' out 
> of line, and that increases the averages respectively up to 1.0785s and 
> 0.2705s.  Not a terrible loss, especially compared to generic times quoted 
> with 3/4, but still, so I think it would best be made where optimising for 
> size, as noted in the cover letter.
> ---
> arch/mips/include/asm/div64.h |    6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
> ===================================================================
> --- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
> +++ linux-3maxp-div64/arch/mips/include/asm/div64.h
> @@ -68,9 +68,11 @@
> 									\
> 	__high = __div >> 32;						\
> 	__low = __div;							\
> -	__upper = __high;						\
> 									\
> -	if (__high) {							\
> +	if (__high < __radix) {						\
> +		__upper = __high;					\
> +		__high = 0;						\
> +	} else {							\
> 		__asm__("divu	$0, %z1, %z2"				\
> 		: "=x" (__modquot)					\
> 		: "Jr" (__high), "Jr" (__radix));			\
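
For readers following the hunk above, here is a minimal portable C sketch of the same idea. It is not the kernel macro: the function name div64_32_sketch is hypothetical, plain C division stands in for the DIVU inline assembly, and radix is assumed to be non-zero. It only illustrates why the high part of the dividend can serve directly as the remainder when it is smaller than the divisor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: divides a 64-bit dividend by a
 * 32-bit radix (assumed non-zero), stores the 64-bit quotient in *quot
 * and returns the 32-bit remainder.
 */
static uint32_t div64_32_sketch(uint64_t div, uint32_t radix, uint64_t *quot)
{
	uint32_t high = (uint32_t)(div >> 32);
	uint32_t low = (uint32_t)div;
	uint32_t upper;
	uint64_t rest;

	if (high < radix) {
		/* high / radix == 0 and high % radix == high: skip the division. */
		upper = high;
		high = 0;
	} else {
		/* Stands in for the DIVU instruction used by the real macro. */
		upper = high % radix;
		high /= radix;
	}

	/* upper < radix, so rest / radix always fits in 32 bits. */
	rest = ((uint64_t)upper << 32) | low;
	*quot = ((uint64_t)high << 32) | (rest / radix);
	return (uint32_t)(rest % radix);
}
```

The first branch mirrors the new `__high < __radix' test in the patch; the else branch corresponds to the existing DIVU path.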
