lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170524151340.GG3063@rapoport-lnx>
Date:   Wed, 24 May 2017 18:13:41 +0300
From:   Mike Rapoport <rppt@...ux.vnet.ibm.com>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     Pavel Emelyanov <xemul@...tuozzo.com>,
        Michal Hocko <mhocko@...nel.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Arnd Bergmann <arnd@...db.de>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        linux-mm <linux-mm@...ck.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

On Wed, May 24, 2017 at 04:54:38PM +0200, Vlastimil Babka wrote:
> On 05/24/2017 04:28 PM, Pavel Emelyanov wrote:
> > On 05/24/2017 02:31 PM, Vlastimil Babka wrote:
> >> On 05/24/2017 12:39 PM, Mike Rapoport wrote:
> >>>> Hm so the prctl does:
> >>>>
> >>>>                 if (arg2)
> >>>>                         me->mm->def_flags |= VM_NOHUGEPAGE;
> >>>>                 else
> >>>>                         me->mm->def_flags &= ~VM_NOHUGEPAGE;
> >>>>
> >>>> That's rather lazy implementation IMHO. Could we change it so the flag
> >>>> is stored elsewhere in the mm, and the code that decides to (not) use
> >>>> THP will check both the per-vma flag and the per-mm flag?
> >>>
> >>> I afraid I don't understand how that can help.
> >>> What we need is an ability to temporarily disable collapse of the pages in
> >>> VMAs that do not have VM_*HUGEPAGE flags set and that after we re-enable
> >>> THP, the vma->vm_flags for those VMAs will remain intact.
> >>
> >> That's what I'm saying - instead of implementing the prctl flag via
> >> mm->def_flags (which gets permanently propagated to newly created vma's
> >> but e.g. doesn't affect already existing ones), it would be setting a
> >> flag somewhere in mm, which khugepaged (and page faults) would check in
> >> addition to the per-vma flags.
> > 
> > I do not insist, but this would make existing paths (checking for flags) be 
> > 2 times slower -- from now on these would need to check two bits (vma flags
> > and mm flags) which are 100% in different cache lines.
> 
> I'd expect you already have mm struct cached during a page fault. And
> THP-eligible page fault is just one per pmd, the overhead should be
> practically zero.
> 
> > What Mike is proposing is the way to fine-tune the existing vma flags. This
> > would keep current paths as fast (or slow ;) ) as they are now. All the
> > complexity would go to rare cases when someone needs to turn thp off for a
> > while and then turn it back on.
> 
> Yeah but it's extending user-space API for a corner case. We should do
> that only when there's no other option.

With madvise() I'm suggesting we rather add "completeness" to the existing
API, IMHO. We do have API that sets VM_HUGEPAGE and clears VM_NOHUGEPAGE or
vise versa, but we do not have an API that can clear both flags...

And if we would use prctl(), we either change user visible behaviour or we
still need to extend the API and use, say, arg2 to distinguish between the
current behaviour and the new one.

--
Sincerely yours,
Mike. 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ