Date:   Mon, 15 Jun 2020 09:32:41 +0200
From:   Jann Horn <jannh@...gle.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     kernel test robot <rong.a.chen@...el.com>,
        Christoph Hellwig <hch@....de>,
        Oleg Nesterov <oleg@...hat.com>,
        Kirill Shutemov <kirill@...temov.name>,
        Jan Kara <jack@...e.cz>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [gup] 17839856fd: stress-ng.vm-splice.ops_per_sec 2158.6% improvement

On Thu, Jun 11, 2020 at 10:24 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Wed, Jun 10, 2020 at 9:05 PM kernel test robot <rong.a.chen@...el.com> wrote:
> >
> > FYI, we noticed a 2158.6% improvement of stress-ng.vm-splice.ops_per_sec due to commit:
> >
> > commit: 17839856fd588f4ab6b789f482ed3ffd7c403e1f ("gup: document and work around "COW can break either way" issue")
>
> Well, that is amusing, and seeing improvements is always nice, but
> somehow I think the test is broken.
>
> I can't see why you'd ever see an improvement from that commit, and if
> you do see one, not one by a factor of 20x.

FWIW, if this is the testcase:
<https://kernel.ubuntu.com/git/cking/stress-ng.git/tree/stress-vm-splice.c>

then that testcase is essentially testing how fast vmsplice() is when
called in a loop on an untouched anonymous mmap() mapping. Before that
commit, I think the first iteration creates zeropage PTEs (and zeropage
PTEs are _PAGE_SPECIAL, see do_anonymous_page()), and
get_user_pages_fast() bails out in gup_pte_range() if pte_special(),
so that testcase was always hitting the GUP slowpath. Now the first
iteration forces the creation of a normal RW PTE, so all following
iterations can go through the GUP fastpath.
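
For illustration, here is a minimal userspace sketch of the access pattern
described above: vmsplice() called in a loop on a mapping that userspace
never writes to. It is not the stress-ng code; the helper name
splice_untouched_mapping() and its return convention are made up for this
example, and it shows only the syscall pattern, not which GUP path the
kernel takes.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* vmsplice() */
#include <sys/mman.h>   /* mmap(), munmap() */
#include <sys/uio.h>    /* struct iovec */
#include <unistd.h>     /* pipe(), read(), close() */

/*
 * Hypothetical helper (not from stress-ng): vmsplice() an anonymous
 * mapping that is never written by userspace into a pipe, `iterations`
 * times, draining the pipe after each call.
 * Returns the number of successful vmsplice() calls, or -1 if the
 * mapping or the pipe could not be set up.
 */
static int splice_untouched_mapping(size_t len, int iterations)
{
    int fds[2];
    int ok = 0;
    char sink[4096];

    /* Fresh anonymous memory: no page has been faulted in yet. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    if (pipe(fds) < 0) {
        munmap(buf, len);
        return -1;
    }

    for (int i = 0; i < iterations; i++) {
        struct iovec iov = { .iov_base = buf, .iov_len = len };

        /* The kernel has to look up the pages behind buf (GUP) here. */
        ssize_t n = vmsplice(fds[1], &iov, 1, 0);
        if (n < 0)
            break;

        /* Drain what was spliced so the next call does not block. */
        while (n > 0) {
            size_t want = (size_t)n < sizeof(sink) ? (size_t)n : sizeof(sink);
            ssize_t r = read(fds[0], sink, want);
            if (r <= 0)
                goto out;
            n -= r;
        }
        ok++;
    }
out:
    close(fds[0]);
    close(fds[1]);
    munmap(buf, len);
    return ok;
}
```

Calling splice_untouched_mapping(4096, 100) from a small main() and timing
it on kernels before and after the commit would be one rough way to look
for the difference the robot reported.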

So in summary I guess the test was just really slow up until now
because it was hitting a slowpath that you wouldn't hit during normal
usage? At least for vmsplice(), writing uninitialized pages doesn't
really make a whole lot of sense...
