lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 1 Jun 2020 22:26:09 +0200
From:   "Michael Karcher" <michael.karcher@...berlin.de>
To:     "Rich Felker" <dalias@...c.org>
Cc:     "John Paul Adrian Glaubitz" <glaubitz@...sik.fu-berlin.de>,
        "Geert Uytterhoeven" <geert@...ux-m68k.org>,
        "Linux-sh list" <linux-sh@...r.kernel.org>,
        "Yoshinori Sato" <ysato@...rs.sourceforge.jp>,
        "Michael Karcher" <kernel@...rcher.dialup.fu-berlin.de>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sh: Implement __get_user_u64() required for 64-bit
 get_user()

Rich Felker schrieb:
>> >> Can I propose a different solution? For archs where there isn't
>> >> actually any 64-bit load or store instruction, does it make sense to
>> >> be writing asm just to do two 32-bit loads/stores, especially when
>> >> this code is not in a hot path?
>> > Yes, that's an option, too.
>> That's the solution that Michael Karcher suggested to me as an
>> alternative when I talked to him off-list.

There is a functional argument agains using get_user_32 twice, which I
overlooked in my private reply to Adrian. If any of the loads fail, we do
not only want err to be set to -EFAULT (which will happen), but we also
want a 64-bit zero as result. If one 32-bit read faults, but the other one
works, we would get -EFAULT together with 32 valid data bits, and 32 zero
bits.

I continue to elaborate on my performance remark, ignoring this functional
difference (which is a good reason to not just use two 32-bit accesses,
much more so than the performance "issue").

> I don't have an objection to doing it the way you've proposed, but I
> don't think there's any performance distinction or issue with the two
> invocations.

Assuming we don't need two exception table entries (put_user_64 currently
uses only one, maybe it's wrong), using put_user_32 twice creates an extra
unneeded exception table entry, which will "bloat" the exception table.
That table is most likely accessed by a binary search algorithm, so the
performance loss is marginal, though. Also a bigger table size is
cache-unfriendly. (Again, this is likely marginal again, as binary search
is already extremely cache-unfriendly).

A similar argument can be made for the exception handler. Even if we need
two entries in the exception table, so the first paragraph does not apply,
the two entries in the exception table can share the same exception
handler (clear the whole 64-bit destination to zero, set -EFAULT, jump
past both load instructions), so that part of (admittedly cold) kernel
code can get some instructios shorter.


Regards,
  Michael Karcher

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ