Date:   Sun, 14 Jun 2020 18:51:43 +0530
From:   afzal mohammed <afzal.mohd.ma@...il.com>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Linus Walleij <linus.walleij@...aro.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Nicolas Pitre <nico@...xnic.net>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()

Hi,

On Sat, Jun 13, 2020 at 10:45:33PM +0200, Arnd Bergmann wrote:

> 4% boot time increase sounds like a lot, especially if that is only for
> copy_from_user/copy_to_user. In the end it really depends on how well
> get_user()/put_user() and small copies can be optimized in the end.

I mentioned the worst case (it happened only once); normally it was in
the range of 2-3%.

> From the numbers you
> measured, it seems the beaglebone currently needs an extra ~6µs or
> 3µs per copy_to/from_user() call with your patch, depending on what
> your benchmark was (MB/s for just reading or writing vs MB/s for
> copying from one file to another through a user space buffer).

It is MB/s for copying one file to another via a user space buffer, i.e.
the value coreutils 'dd' shows w/ status=progress (here it was busybox
'dd', so it had to be enabled via a compile time option instead).
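
For reference, the arithmetic behind a per-call estimate like the one
quoted above can be sketched roughly as below. This is only an
illustration, not the calculation anyone in the thread actually ran: the
function names are hypothetical, 1 MB is assumed to be 10^6 bytes, and the
whole slowdown is assumed to come from the copy_{to,from}_user() calls.
Since a file-to-file copy does one copy_to_user() on read and one
copy_from_user() on write per block, the extra time per block is split
over two calls in that case, which is where the factor of two comes from.

```c
#include <stddef.h>

/*
 * Time spent per block, in microseconds, at a given throughput.
 * bytes / (MB/s) yields microseconds when 1 MB = 10^6 bytes (assumption).
 */
double us_per_block(double mb_per_s, size_t block_bytes)
{
    return (double)block_bytes / mb_per_s;
}

/*
 * Extra time per copy_{to,from}_user() call, in microseconds
 * (hypothetical helper, for illustration only).
 *
 * calls_per_block: 1 for a read-only or write-only benchmark,
 *                  2 for a file-to-file copy through a user space
 *                  buffer (one copy on read, one copy on write).
 */
double extra_us_per_call(double base_mb_per_s, double patched_mb_per_s,
                         size_t block_bytes, int calls_per_block)
{
    double extra_per_block = us_per_block(patched_mb_per_s, block_bytes)
                           - us_per_block(base_mb_per_s, block_bytes);

    return extra_per_block / calls_per_block;
}
```

Calling extra_us_per_call() with the throughput before and after the
patch, the 'dd' block size, and calls_per_block = 2 gives the per-call
figure for the file-to-file case; calls_per_block = 1 gives the figure
for a pure read or pure write benchmark.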

> but if you want to test what the overhead is, you could try changing
> /dev/zero (or a different chardev like it) to use a series of
> put_user(0, u32uptr++) in place of whatever it has, and then replace the
> 'str' instruction with dummy writes to ttbr0 using the value it already
> has, like:
> 
>       mcr     p15, 0, %0, c2, c0, 0  /* set_ttbr0() */
>       isb  /* prevent speculative access to kernel table */
>       str    %1, [%2],0 /* write 32 bit to user space */
>       mcr     p15, 0, %0, c2, c0, 0  /* set_ttbr0() */
>       isb  /* prevent speculative access to user table */

> It would be interesting to compare it to the overhead of a
> get_user_page_fast() based implementation.

I have to relocate & be in quarantine for a couple of weeks, so I will
temporarily stop here, otherwise I might end up on the roadside.

Reading the feedback from everyone, some of which I could grasp only in
bits & pieces, familiarizing myself more w/ mm & vfs would help me add
more value to the goal/discussion. Linus Walleij, if you wish to explore
things, feel free; right now I don't know how my connectivity will be for
the next 3 weeks.

Regards
afzal
