Message-ID: <Ydd6IRTxI5RU/Sp1@google.com>
Date:   Thu, 6 Jan 2022 15:24:17 -0800
From:   Minchan Kim <minchan@...nel.org>
To:     John Hubbard <jhubbard@...dia.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        David Hildenbrand <david@...hat.com>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        John Dias <joaodias@...gle.com>, huww98@...look.com
Subject: Re: [RFC v2] mm: introduce page pin owner

On Thu, Jan 06, 2022 at 02:27:48PM -0800, John Hubbard wrote:
> On 12/28/21 09:59, Minchan Kim wrote:
> > A Contiguous Memory Allocator (CMA) allocation can fail if any page
> > within the requested range has an elevated refcount (a pinned page).
> > 
> > Debugging such failures is difficult, because the struct pages only
> > show a combined refcount, and do not show the callstacks or
> > backtraces of the code that acquired each refcount. So the source
> > of the page pins remains a mystery, at the time of CMA failure.
> > 
> > In order to solve this without adding too much overhead, just do
> > nothing most of the time, which is pretty low overhead. However,
> > once a CMA failure occurs, then mark the page (this requires a
> > pointer's worth of space in struct page, but it uses page extensions
> > to get that), and start tracing the subsequent put_page() calls.
> > As the program finishes up, each page pin will be undone, and
> > traced with a backtrace. The programmer reads the trace output and
> > sees the list of all page pinning code paths.
> > 
> > This will consume an additional 8 bytes per 4KB page, or an
> > additional 0.2% of RAM. In addition to the storage space, it will
> > have some performance cost, due to increasing the size of struct
> > page so that it is greater than the cacheline size (or multiples
> > thereof) of popular (x86, ...) CPUs.
> > 
> > The idea can apply to every user of migrate_pages, as well as CMA, to
> > learn the reason why the page migration failed. To support that,
> > the implementation takes the "enum migrate_reason" string as a filter
> > for the tracepoint (see below).
> > 
> 
> Hi Minchan,
> 
> If this is ready to propose, then maybe it's time to remove the "RFC"
> qualification from the subject line, and re-post for final review.
> 
> And also when you do that, could you please specify which tree or commit
> this applies to? I wasn't able to figure that out this time.

Sorry about that. It was based on next-20211224.

> 
> > Usage)
> 
> This extensive "usage" section is probably helpful, but the commit
> log is certainly not the place for the "how to" documentation. Let's
> find an .rst file to stash it in, I think.

I wanted to get some review of the implementation/interface/usage before
respinning without the RFC tag. Otherwise, the documentation would need
heavy updates with each revision. Based on your comment, I think you mostly
agree with it as-is. So, yeah, let me cook up the doc and repost it
without the RFC tag.

Thanks.
