Message-ID: <35AFFA5A-B499-4D64-9E98-42B9A642EB0F@zytor.com>
Date: Sat, 01 Feb 2020 14:29:35 -0800
From: hpa@...or.com
To: Wen Yang <wenyang@...ux.alibaba.com>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>
CC: x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/tsc: improve arithmetic division
On January 30, 2020 5:08:38 AM PST, Wen Yang <wenyang@...ux.alibaba.com> wrote:
>do_div() does a 64-by-32 division. Use div64_u64() or div64_ul()
>instead of it if the divisor is 'u64' or 'unsigned long', to avoid
>truncation to the lower 32 bits.
>And as a nice side effect also cleans up the function a bit.
>
>Signed-off-by: Wen Yang <wenyang@...ux.alibaba.com>
>Cc: Thomas Gleixner <tglx@...utronix.de>
>Cc: Ingo Molnar <mingo@...hat.com>
>Cc: Borislav Petkov <bp@...en8.de>
>Cc: "H. Peter Anvin" <hpa@...or.com>
>Cc: x86@...nel.org
>Cc: linux-kernel@...r.kernel.org
>---
> arch/x86/kernel/tsc.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
>
>diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>index 7e322e2daaf5..4c0320e68699 100644
>--- a/arch/x86/kernel/tsc.c
>+++ b/arch/x86/kernel/tsc.c
>@@ -357,9 +357,7 @@ static unsigned long calc_pmtimer_ref(u64 deltatsc, u64 pm1, u64 pm2)
> pm2 -= pm1;
> tmp = pm2 * 1000000000LL;
> do_div(tmp, PMTMR_TICKS_PER_SEC);
>- do_div(deltatsc, tmp);
>-
>- return (unsigned long) deltatsc;
>+ return (unsigned long) div64_u64(deltatsc, tmp);
> }
>
> #define CAL_MS 10
>@@ -778,8 +776,7 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
> tsc_ref_min = min(tsc_ref_min, (unsigned long) tsc2);
>
> /* Check the reference deviation */
>- delta = ((u64) tsc_pit_min) * 100;
>- do_div(delta, tsc_ref_min);
>+ delta = div64_ul(((u64) tsc_pit_min) * 100, tsc_ref_min);
>
> /*
> * If both calibration results are inside a 10% window
A full 64-by-64 division is a *lot* more expensive on 32 bits (something like 10x), and as the output is truncated to unsigned long anyway, it is also unnecessary.
We don't use the remainder, so using do_div() is not merely unnecessary but almost certainly generates worse code. We are multiplying and then dividing by a constant, and most of the time gcc can optimize that into a single multiply/shift operation; when it doesn't, we can do that optimization for it (see timeconst.bc).
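To make the multiply/shift idea concrete, here is a minimal userspace sketch of the ticks-to-nanoseconds step. The names PMTMR_NS_SHIFT, PMTMR_NS_MULT and both helper functions are hypothetical and not taken from the kernel; the 30-bit scale is an assumption chosen so that the product stays below 2^64 for tick counts under 2^24. The result may differ from the exact quotient by at most 1 ns in that range.

```c
#include <stdint.h>

/* ACPI PM timer frequency in Hz, as used by the quoted code. */
#define PMTMR_TICKS_PER_SEC 3579545ULL

/* Hypothetical fixed-point scale (an assumption of this sketch). */
#define PMTMR_NS_SHIFT 30

/* round(10^9 * 2^30 / 3579545), evaluated at compile time. */
#define PMTMR_NS_MULT \
	(((1000000000ULL << PMTMR_NS_SHIFT) + PMTMR_TICKS_PER_SEC / 2) / \
	 PMTMR_TICKS_PER_SEC)

/* Reference version: multiply, then divide by the constant. */
static uint64_t pmtmr_ticks_to_ns_div(uint64_t ticks)
{
	return ticks * 1000000000ULL / PMTMR_TICKS_PER_SEC;
}

/*
 * Same conversion as one multiply and one shift.
 * Only valid for ticks < 2^24; larger inputs can overflow the product.
 */
static uint64_t pmtmr_ticks_to_ns_mulshift(uint64_t ticks)
{
	return (ticks * PMTMR_NS_MULT) >> PMTMR_NS_SHIFT;
}
```

Whether the compiler already emits something equivalent for the plain expression depends on the target and optimization level, so this is only worth doing by hand after looking at the generated code.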
The one thing that gcc can't necessarily do automatically is to know when a 64/32 → 32 division is safe: C semantics are truncation, but the CPU will trap if the quotient does not fit in 32 bits. If gcc can turn the division into a multiply, then that problem obviously goes away.
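The safety condition can be written down explicitly. As a rough sketch (the helper names are hypothetical, not kernel API): the quotient of a 64-by-32 division fits in 32 bits exactly when the upper 32 bits of the dividend are smaller than the divisor.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when dividend / divisor fits in 32 bits, i.e. when the upper half
 * of the dividend is smaller than the divisor. A zero divisor never fits.
 */
static bool div_64_32_fits_u32(uint64_t dividend, uint32_t divisor)
{
	return divisor != 0 && (uint32_t)(dividend >> 32) < divisor;
}

/*
 * Checked division: stores the quotient and returns true only when the
 * narrow 64/32 -> 32 form would be safe; otherwise leaves *quot untouched.
 */
static bool div_u64_u32_checked(uint64_t dividend, uint32_t divisor,
				uint32_t *quot)
{
	if (!div_64_32_fits_u32(dividend, divisor))
		return false;
	*quot = (uint32_t)(dividend / divisor);
	return true;
}
```

The compiler can use the narrow divide only if it can prove this condition from the value ranges, which is why it usually falls back to the wider operation.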
So first I would test with regular / operators and see what code comes out.
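For that experiment, a standalone approximation of the arithmetic in the first hunk might look like the following. It is a userspace sketch, not the kernel function: the name calc_pmtimer_ref_plain is hypothetical, only the lines visible in the quoted diff are modeled, and the zero guard is an addition made for the sketch.

```c
#include <stdint.h>

/* ACPI PM timer frequency in Hz, as used by the quoted code. */
#define PMTMR_TICKS_PER_SEC 3579545ULL

/* The arithmetic from the quoted hunk, written with plain / operators. */
static unsigned long calc_pmtimer_ref_plain(uint64_t deltatsc,
					    uint64_t pm1, uint64_t pm2)
{
	uint64_t ns;

	pm2 -= pm1;
	ns = pm2 * 1000000000ULL / PMTMR_TICKS_PER_SEC;
	if (ns == 0)
		return 0;	/* sketch-only guard, not in the quoted patch */

	return (unsigned long)(deltatsc / ns);
}
```

Compiling it with something like `gcc -O2 -m32 -S` and again without `-m32` shows what each target does with the two divisions.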
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.