linux-kernel - Re: [PATCH] time/sched_clock: Allow architecture to override cyc_to


Message-ID: <CAAhV-H6Bq63uM-ifkM8KDJGD1uavv42bG9ij_CZBbCpC-AFSjg@mail.gmail.com>
Date:   Tue, 16 Nov 2021 09:41:05 +0800
From:   Huacai Chen <chenhuacai@...il.com>
To:     John Stultz <john.stultz@...aro.org>
Cc:     Huacai Chen <chenhuacai@...ngson.cn>,
        Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Xuefeng Li <lixuefeng@...ngson.cn>,
        Jiaxun Yang <jiaxun.yang@...goat.com>
Subject: Re: [PATCH] time/sched_clock: Allow architecture to override cyc_to_ns()

Hi, John,

On Tue, Nov 16, 2021 at 1:27 AM John Stultz <john.stultz@...aro.org> wrote:
>
> On Sat, Nov 13, 2021 at 11:47 PM Huacai Chen <chenhuacai@...ngson.cn> wrote:
> >
> > The current cyc_to_ns() implementation is like this:
> >
> > static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
> > {
> >         return (cyc * mult) >> shift;
> > }
> >
> > But u64*u32 may overflow, so introduce ARCH_HAS_CYC_TO_NS to allow
> > an architecture to override it.
> >
>
> If that's the case, it would seem too large a mult/shift pair had been selected.
We use a 100 MHz clock and the counter is 64 bits wide; the mult is
~160M. But even if we use a smaller mult, cyc * mult can still overflow.
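
To make the overflow point concrete, here is a rough sketch of the
arithmetic. The exact mult and shift are assumptions chosen to match
"~160M" (10 ns per cycle at 100 MHz with shift = 24); the helper names
are hypothetical and not taken from the kernel.

```c
#include <stdint.h>

/* Assumed values for illustration only: 10 ns per cycle, shift = 24. */
#define EXAMPLE_MULT  167772160u   /* 10 << 24, i.e. roughly 160M */
#define EXAMPLE_SHIFT 24u

/* Same arithmetic as the generic cyc_to_ns(): the product wraps at 64 bits. */
static uint64_t cyc_to_ns_u64(uint64_t cyc, uint32_t mult, uint32_t shift)
{
        return (cyc * mult) >> shift;
}

/* Largest cycle count whose product with mult still fits in 64 bits. */
static uint64_t max_safe_cycles(uint32_t mult)
{
        return UINT64_MAX / mult;
}
```

Under these assumptions max_safe_cycles(EXAMPLE_MULT) is about 1.1e11
cycles, which a 100 MHz counter reaches after roughly 1100 seconds. One
cycle past that limit, the truncated product yields a nanosecond value
near zero instead of about 1.1e12.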

>
> What sort of cycle range are you considering to be valid here? Can you
> provide more rationale as to why this needs the ability to be
> overridden?
>
> And what sort of arch-specific logic do you envision, rather than
> having a common implementation to avoid the overflow?
The full u64*u64 product can be computed by hardware, with the high
bits and low bits of the result stored in two registers. So, if we use
assembly, we can handle the overflow correctly. E.g., LoongArch (and
MIPS) can override cyc_to_ns() like this:

static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
{
        u64 t1, t2, t3;
        unsigned long long rv;

        /* 64-bit arithmetic can overflow, so use 128-bit. */
        __asm__ (
                "nor            %[t1], $r0, %[shift]    \n\t"
                "mulh.du        %[t2], %[cyc], %[mult]  \n\t"
                "mul.d          %[t3], %[cyc], %[mult]  \n\t"
                "slli.d         %[t2], %[t2], 1         \n\t"
                "srl.d          %[rv], %[t3], %[shift]  \n\t"
                "sll.d          %[t1], %[t2], %[t1]     \n\t"
                "or             %[rv], %[t1], %[rv]     \n\t"
                : [rv] "=&r" (rv), [t1] "=&r" (t1), [t2] "=&r" (t2),
                  [t3] "=&r" (t3)
                : [cyc] "r" (cyc), [mult] "r" (mult), [shift] "r" (shift)
                : );
        return rv;
}
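
For comparison, the same computation can be sketched in portable C.
This is only an illustration, not part of the patch: it assumes a
compiler that provides unsigned __int128 (GCC and Clang do on 64-bit
targets), assumes shift < 64, and uses hypothetical function names.
The kernel's mul_u64_u32_shr() helper in include/linux/math64.h covers
the same ground in generic code.

```c
#include <stdint.h>

/* Hypothetical portable counterpart: widen to 128 bits, then shift. */
static inline uint64_t cyc_to_ns_wide(uint64_t cyc, uint32_t mult, uint32_t shift)
{
        unsigned __int128 prod = (unsigned __int128)cyc * mult;

        return (uint64_t)(prod >> shift);
}

/* Same result assembled from the two 64-bit halves, mirroring the asm steps. */
static inline uint64_t cyc_to_ns_hilo(uint64_t cyc, uint32_t mult, uint32_t shift)
{
        unsigned __int128 prod = (unsigned __int128)cyc * mult;
        uint64_t hi = (uint64_t)(prod >> 64);   /* what mulh.du produces */
        uint64_t lo = (uint64_t)prod;           /* what mul.d produces */

        /*
         * (hi << 1) << (63 - shift) equals hi << (64 - shift), but stays
         * well defined when shift is 0 (the high part then drops out).
         */
        return (lo >> shift) | ((hi << 1) << (~shift & 63));
}
```

The second variant shows why the assembly shifts the high half left by
one and then by the inverted shift amount: a single shift by
(64 - shift) would be a shift by 64 when shift is 0.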

Huacai
>
> thanks
> -john