[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a650e078-25cb-14f8-4880-e575c937847@google.com>
Date: Wed, 9 Feb 2022 14:59:19 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: Michal Hocko <mhocko@...e.com>
cc: Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Matthew Wilcox <willy@...radead.org>,
David Hildenbrand <david@...hat.com>,
Alistair Popple <apopple@...dia.com>,
Johannes Weiner <hannes@...xchg.org>,
Rik van Riel <riel@...riel.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Yu Zhao <yuzhao@...gle.com>, Greg Thelen <gthelen@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 00/13] mm/munlock: rework of mlock+munlock page
handling
On Wed, 9 Feb 2022, Michal Hocko wrote:
> On Wed 09-02-22 08:21:17, Hugh Dickins wrote:
> > On Wed, 9 Feb 2022, Michal Hocko wrote:
> [...]
> > > The only thing that is not entirely clear to me at the moment is why you
> > > have chosen to ignore already mapped LOCKONFAULT pages. They will
> > > eventually get sorted out during the reclaim/migration but this can
> > > backfire if too many pages have been pre-faulted before LOCKONFAULT
> > > call. Maybe not an interesting case in the first place but I am still
> > > wondering why you have chosen that way.
> >
> > I'm puzzled: what makes you think I'm ignoring already mapped LOCKONFAULT
> > pages? I'd consider that a bug.
>
> I've had the following path in mind
> __mm_populate
> populate_vma_page_range
> if (vma->vm_flags & VM_LOCKONFAULT)
> return nr_page
>
> which means that __get_user_pages is not called at all. This also means
> that follow_page_mask is skipped. The page table walk used to mark
> already mapped pages as mlocked so unless I miss some other path it
> would stay on its original LRU list and only get real mlock protection
> when encountered by the reclaim or migration.
That is so until 04/13: but then, whenever a page is faulted into a
VM_LOCKED area (except when skipping pte-of-compound) it goes through
mlock_page(). So it will then lock-on-fault.
Ah, but you're worrying about any previously-mapped pages when
VM_LOCKONFAULT is established: those ones get done by mlock_pte_range()
in 07/13, same as when VM_LOCKED is established - I checked again,
VM_LOCKONFAULT is not set without VM_LOCKED.
>
> > It is the case, isn't it, that a VM_LOCKONFAULT area always has VM_LOCKED
> > set too? If I've got that wrong, yes, I'll need to revisit conditions.
>
> Yes, I did't really mean we would lose the mlock protection. We would
> just do the lazy mlock also on pages which are already mapped. This is
> certainly not a correctness issue - althoug stats might be off - but it
> could polute existing LRUs with mlocked pages rather easily.
Even with the lazy mlock in reclaim, I'd still accuse myself of a bug
if I've got this wrong.
>
> As I've said. Either I am still missing something or this might even not
> be a big deal in real life. I was mostly curious about the choice to
> exclude the page table walk for LOCKONFAULT.
It surprised me too: but looking at how FOLL_POPULATE was handled in
faultin_page(), it appeared to me to become redundant, once mlocking
was counted at fault time (and in mlock_pte_range() - maybe that's
the page table walk you're looking for, not excluded but moved).
It is a big deal for me: please don't let me fool you, please do
continue to worry at it until you're convinced or you convince me.
Thanks,
Hugh
Powered by blists - more mailing lists