lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <35AFFA5A-B499-4D64-9E98-42B9A642EB0F@zytor.com>
Date:   Sat, 01 Feb 2020 14:29:35 -0800
From:   hpa@...or.com
To:     Wen Yang <wenyang@...ux.alibaba.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>
CC:     x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/tsc: improve arithmetic division

On January 30, 2020 5:08:38 AM PST, Wen Yang <wenyang@...ux.alibaba.com> wrote:
>do_div() does a 64-by-32 division. Use div64_ul64() or div64_ul()
>instead of it if the divisor is 'ul64' or 'unsigned long', to avoid
>truncation to lower 32-bit.
>And as a nice side effect also cleans up the function a bit.
>
>Signed-off-by: Wen Yang <wenyang@...ux.alibaba.com>
>Cc: Thomas Gleixner <tglx@...utronix.de>
>Cc: Ingo Molnar <mingo@...hat.com>
>Cc: Borislav Petkov <bp@...en8.de>
>Cc: "H. Peter Anvin" <hpa@...or.com>
>Cc: x86@...nel.org
>Cc: linux-kernel@...r.kernel.org
>---
> arch/x86/kernel/tsc.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
>
>diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>index 7e322e2daaf5..4c0320e68699 100644
>--- a/arch/x86/kernel/tsc.c
>+++ b/arch/x86/kernel/tsc.c
>@@ -357,9 +357,7 @@ static unsigned long calc_pmtimer_ref(u64 deltatsc,
>u64 pm1, u64 pm2)
> 	pm2 -= pm1;
> 	tmp = pm2 * 1000000000LL;
> 	do_div(tmp, PMTMR_TICKS_PER_SEC);
>-	do_div(deltatsc, tmp);
>-
>-	return (unsigned long) deltatsc;
>+	return (unsigned long) div64_u64(deltatsc, tmp);
> }
> 
> #define CAL_MS		10
>@@ -778,8 +776,7 @@ static unsigned long
>pit_hpet_ptimer_calibrate_cpu(void)
> 		tsc_ref_min = min(tsc_ref_min, (unsigned long) tsc2);
> 
> 		/* Check the reference deviation */
>-		delta = ((u64) tsc_pit_min) * 100;
>-		do_div(delta, tsc_ref_min);
>+		delta = div64_ul(((u64) tsc_pit_min) * 100, tsc_ref_min);
> 
> 		/*
> 		 * If both calibration results are inside a 10% window

This is a *lot* more expensive on 32 bits (something like 10x) and as the output is truncated to unsigned long anyway, it is also unnecessary.

We don't use the remainder, so using do_div() is not merely unnecessary but almost certainly generates worse code: we are multiplying and then dividing by a constant, and most of the time gcc can optimize that into a single multiply/shift operation; otherwise we can do that optimization for it (see timeconst.bc.)

The one thing that gcc can't necessary do automatically is to know when a 64/32 → 32 division is safe; C semantics are truncation, but the CPU will trap. If it can turn it into a multiply then that problem obviously goes away.

So first I would test with regular / operators and see what code comes out.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ