linux-kernel - Re: [PATCH] time/sched_clock: Allow architecture to override cyc_to

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAAhV-H6NeB9Or=fc31r7teFdbeUZcJNh3VbGNXCtTbA7L38x3g@mail.gmail.com>
Date:   Thu, 18 Nov 2021 16:34:08 +0800
From:   Huacai Chen <chenhuacai@...il.com>
To:     John Stultz <john.stultz@...aro.org>
Cc:     Huacai Chen <chenhuacai@...ngson.cn>,
        Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Xuefeng Li <lixuefeng@...ngson.cn>,
        Jiaxun Yang <jiaxun.yang@...goat.com>
Subject: Re: [PATCH] time/sched_clock: Allow architecture to override cyc_to_ns()

Hi, John,

On Wed, Nov 17, 2021 at 4:31 AM John Stultz <john.stultz@...aro.org> wrote:
>
> On Tue, Nov 16, 2021 at 3:37 AM Huacai Chen <chenhuacai@...il.com> wrote:
> > On Tue, Nov 16, 2021 at 10:05 AM John Stultz <john.stultz@...aro.org> wrote:
> > > On Mon, Nov 15, 2021 at 5:41 PM Huacai Chen <chenhuacai@...il.com> wrote:
> > > > On Tue, Nov 16, 2021 at 1:27 AM John Stultz <john.stultz@...aro.org> wrote:
> > > > > On Sat, Nov 13, 2021 at 11:47 PM Huacai Chen <chenhuacai@...ngson.cn> wrote:
> > > > > >
> > > > > > The current cyc_to_ns() implementation is like this:
> > > > > >
> > > > > > static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
> > > > > > {
> > > > > >         return (cyc * mult) >> shift;
> > > > > > }
> > > > > >
> > > > > > But u64*u32 maybe overflow, so introduce ARCH_HAS_CYC_TO_NS to allow
> > > > > > architecture to override it.
> > > > > >
> > > > >
> > > > > If that's the case, it would seem too large a mult/shift pair had been selected.
> > > > We use a 100MHz clock and the counter is 64bit, the mult is ~160M. But
> > > > even if we use a smaller mult, cyc*mult, it can also overflow.
> > >
> > > Well, yes, any simple multiplication could overflow. My point is that
> > > the mult/shift pair is usually calculated for an expected interval
> > > range via clocks_calc_mult_shift(), where the max interval for
> > > sched_clock is set to an hour.  So any interval length under an hour
> > > should not overflow in a simple multiplication.
> > >
> > > What I'm trying to understand is what is the case you have where your
> > > interval length is longer than an hour?
> > > As that might break other assumptions going on in the code.
> > We found that the mult is "reasonable" if we use 100MHz, 50MHz or
> > 25MHz clocks, but if we use some others, such as 33MHz, the calculated
> > mult is dramatically large.
>
> So that sounds a little concerning, but in clocks_calc_mult_shift() we
> try to find an accurate mult/shift pair that fits the expected range.
> As we want to be as accurate as possible, we need higher shift values
> for freqs that don't evenly divide NSEC_PER_SEC, in order to minimize
> the error. So larger mult/shift values isn't necessarily a problem.
>
> For the set you gave, the code should give
> 100000000 -> m:20971520 s:21
> 50000000 -> m:41943040 s:21
> 25000000 -> m:83886080 s:21
> 33333333 -> m:125829121 s:22
>
> And yes, the 33Mhz does have a larger mult/shift pair, but that's
> because we want to be as accurate as possible in the conversion.
Maybe we have made some mistakes, let's investigate it deeper. Thanks.

Huacai

>
> And calculating that max interval multiplication (one hour of cycles):
> 100Mhz:     (180000000000cyc * 41943040 = 7549747200000000000) >> 21 =
> 3600000000000 ns
> 50Mhz:       (90000000000cyc * 83886080 = 7549747200000000000) >> 21 =
> 3600000000000 ns
> 25Mhz:       (360000000000cyc * 20971520 = 7549747200000000000) >> 21
> = 3600000000000 ns
> 33Mhz:       (119999998800cyc * 125829121 = 15099494369005054800) >>
> 22 = 3599999992610 ns
>
> So assuming you're seeing the same mult/shift from above, we should be
> able to handle an hour of cycles without overflowing the
> multiplication.
> (Though it definitely gets close with the 33Mhz case, as we would
> overflow with an hour + 7 seconds or so of cycles.)
>
> If you're seeing something else, let us know, as then it's probably a bug.
>
> But again, it would be good to understand the use case where the
> sched_clock epoch isn't being updated for over an hour.  I'm guessing
> hrtimers are being deferred for a super long time?
>
> > > > > What sort of cycle range are you considering to be valid here? Can you
> > > > > provide more rationale as to why this needs the ability to be
> > > > > overridden?
> > > > >
> > > > > And what sort of arch-specific logic do you envision, rather than
> > > > > having a common implementation to avoid the overflow?
> > > > u64*u64 can be handled by hardware (store the high bits and low bits
> > > > of result in two registers). So, if we use assembly, we can handle the
> > > > overflow correctly. E.g., LoongArch (and MIPS) can override
> > > > cyc_to_ns() like this:
> > > >
> > > > static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
> > > > {
> > > >         u64 t1, t2, t3;
> > > >         unsigned long long rv;
> > > >
> > > >         /* 64-bit arithmetic can overflow, so use 128-bit. */
> > > >         __asm__ (
> > > >                 "nor            %[t1], $r0, %[shift]    \n\t"
> > > >                 "mulh.du        %[t2], %[cyc], %[mult]  \n\t"
> > > >                 "mul.d          %[t3], %[cyc], %[mult]  \n\t"
> > > >                 "slli.d         %[t2], %[t2], 1         \n\t"
> > > >                 "srl.d          %[rv], %[t3], %[shift]  \n\t"
> > > >                 "sll.d          %[t1], %[t2], %[t1]     \n\t"
> > > >                 "or             %[rv], %[t1], %[rv]     \n\t"
> > > >                 : [rv] "=&r" (rv), [t1] "=&r" (t1), [t2] "=&r" (t2),
> > > > [t3] "=&r" (t3)
> > > >                 : [cyc] "r" (cyc), [mult] "r" (mult), [shift] "r" (shift)
> > > >                 : );
> > > >         return rv;
> > > > }
> > >
> > > But then isn't the mul_u64_u32_shr() the right abstraction for such a
> > > custom implementation?
> > >
> > > Then potentially implement a generic cyc_to_ns() implementation that
> > > uses that instead?
> > OK, we will try to override mul_u64_u32_shr() rather than cyc_to_ns().
>
> If the use case is considered reasonable, there likely still needs to
> be a cyc_to_ns() implementation that uses mult_u64_u32_shr(), but
> hopefully it can be generic (maybe under a config option for the
> deferred hrtimer use case?).
>
> thanks
> -john