Date:   Wed, 10 Oct 2018 14:19:24 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Andrea Arcangeli <aarcange@...hat.com>
cc:     Michal Hocko <mhocko@...nel.org>, Mel Gorman <mgorman@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andrea Argangeli <andrea@...nel.org>,
        Zi Yan <zi.yan@...rutgers.edu>,
        Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>,
        "Kirill A. Shutemov" <kirill@...temov.name>, linux-mm@...ck.org,
        LKML <linux-kernel@...r.kernel.org>,
        Stable tree <stable@...r.kernel.org>
Subject: Re: [PATCH 1/2] mm: thp:  relax __GFP_THISNODE for MADV_HUGEPAGE
 mappings

On Tue, 9 Oct 2018, Andrea Arcangeli wrote:

> I think "madvise vs mbind" is more an issue of "no-permission vs
> permission" required. And if the process ends up swapping out all
> other processes with their memory already allocated in the node, I think
> some permission is correct to be required, in which case an mbind
> looks a better fit. MPOL_PREFERRED also looks a first candidate for
> investigation as it's already not black and white and allows spillover
> and may already do the right thing in fact if set on top of
> MADV_HUGEPAGE.
> 

We would never want to thrash the local node for hugepages because there 
is no guarantee that any swapping is useful.  On COMPACT_SKIPPED due to 
low memory, we have very clear evidence that pageblocks are already 
sufficiently fragmented by unmovable pages such that compaction itself, 
even with abundant free memory, fails to free an entire pageblock due to 
the allocator's preference to fragment pageblocks of fallback migratetypes 
over returning remote free memory.

As I've stated, we do not want to reclaim pointlessly when compaction is 
unable to access the freed memory or there is no guarantee it can free an 
entire pageblock.  Doing so allows thrashing of the local node, or remote 
nodes if __GFP_THISNODE is removed, and the hugepage still cannot be 
allocated.  If this proposed mbind() that requires permissions is geared 
to me as the user, I'm afraid the details of what leads to the thrashing 
are not well understood because I certainly would never use this.
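
For readers unfamiliar with the interfaces under discussion, here is a minimal userspace sketch of what "MPOL_PREFERRED set on top of MADV_HUGEPAGE" could look like. It is an illustration only and not code from this thread: the helper names `hpage_align_up` and `thp_prefer_node` are hypothetical, the 2 MiB hugepage size is an assumption (x86-64), and `mbind()` is invoked through `syscall()` so that the sketch does not depend on libnuma headers. It shows only how the two hints are combined; it says nothing about whether the kernel will reclaim or compact to satisfy them, which is the point in dispute above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value from the Linux UAPI <linux/mempolicy.h>; defined here so the sketch
 * builds without libnuma's <numaif.h>. */
#ifndef MPOL_PREFERRED
#define MPOL_PREFERRED 1
#endif

/* Assumption: 2 MiB transparent hugepages, as on x86-64. */
#define HPAGE_SIZE (2UL << 20)

/* Round len up to a whole number of hugepages. */
static size_t hpage_align_up(size_t len)
{
    return (len + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
}

/*
 * Hypothetical helper: request THP for [addr, addr + len) and mark `node`
 * as preferred, which still permits spillover to other nodes.
 * addr must be page-aligned. Returns 0 on success, -1 on failure
 * (for example, THP or NUMA support not configured).
 */
static int thp_prefer_node(void *addr, size_t len, int node)
{
    unsigned long nodemask;

    /* A single unsigned long mask is used; keep the node within it. */
    assert(node >= 0 && node < (int)(8 * sizeof(nodemask)) - 1);
    nodemask = 1UL << node;

    if (madvise(addr, len, MADV_HUGEPAGE) != 0)
        return -1;

    if (syscall(SYS_mbind, addr, len, MPOL_PREFERRED, &nodemask,
                (unsigned long)(8 * sizeof(nodemask)), 0UL) != 0)
        return -1;

    return 0;
}
```

A caller would typically mmap() an anonymous region whose length comes from hpage_align_up(), pass it to thp_prefer_node(), and treat a -1 return as "hints not applied" rather than as a fatal error.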
