lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 6 Jun 2017 11:31:02 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Palmer Dabbelt <palmer@...belt.com>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Olof Johansson <olof@...om.net>, albert@...ive.com
Subject: Re: [PATCH 5/7] RISC-V: arch/riscv/lib

On Tue, Jun 6, 2017 at 6:56 AM, Palmer Dabbelt <palmer@...belt.com> wrote:
> On Fri, 26 May 2017 02:06:58 PDT (-0700), Arnd Bergmann wrote:
>> On Thu, May 25, 2017 at 3:59 AM, Palmer Dabbelt <palmer@...belt.com> wrote:
>>> On Tue, 23 May 2017 04:19:42 PDT (-0700), Arnd Bergmann wrote:
>>>> On Tue, May 23, 2017 at 2:41 AM, Palmer Dabbelt <palmer@...belt.com> wrote:
>>
>>>> Also, it would be good to replace the multiply+div64
>>>> with a single multiplication here, see how x86 and arm do it
>>>> (for the tsc/__timer_delay case).
>>>
>>> Makes sense.  I think this should do it
>>>
>>>   https://github.com/riscv/riscv-linux/commit/d397332f6ebff42f3ecb385e9cf3284fdeda6776
>>>
>>> but I'm finding this hard to test as this only works for 2ms sleeps.  It seems
>>> at least in the right ballpark
>>
>> + if (usecs > MAX_UDELAY_US) {
>> + __delay((u64)usecs * riscv_timebase / 1000000ULL);
>> + return;
>> + }
>>
>> You still do the 64-bit division here. What I meant is to completely
>> avoid the division and use a multiply+shift.
>
> The goal here was to avoid the error case that ARM has on overflow and instead
> just delay for the requested time.  This should only divide when the delay is
>>=2ms, so the division won't cost much in comparison.
>
> The normal case should have no division in it.
>
> I can copy ARM's error handling if you think that's better, but it seemed more
> complicated than just computing the correct answer.

I think the intention originally was to avoid overflowing the 32-bit
argument in

 void __delay(unsigned long cycles)

If you need to delay for more than 4 billion clocksource cycles,
your code is still broken.

>> Also, you don't need to base anything on HZ, as you do not rely
>> on the delay calibration but always use a timer.
>
> That makes sense, I just based this blindly off the ARM version.  I'll see if
> that lets me avoid unnecessary overflow for ndelay.  If it doesn't then I'd
> prefer to just keep exactly the same constraints ARM has to avoid unexpected
> behavior.

Right, I should have been more specific here, as ARM has two implementations
(loop and timer) and it gets much easier if you know you can rely on the
timer to be available.

       Arnd

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ