[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2ec49f65-fe4e-26a0-4059-c18e6dab0af4@suse.cz>
Date: Fri, 11 Feb 2022 17:45:26 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...e.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Matthew Wilcox <willy@...radead.org>,
David Hildenbrand <david@...hat.com>,
Alistair Popple <apopple@...dia.com>,
Johannes Weiner <hannes@...xchg.org>,
Rik van Riel <riel@...riel.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Yu Zhao <yuzhao@...gle.com>, Greg Thelen <gthelen@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 07/13] mm/munlock: mlock_pte_range() when mlocking or
munlocking
On 2/6/22 22:42, Hugh Dickins wrote:
> Fill in missing pieces: reimplementation of munlock_vma_pages_range(),
> required to lower the mlock_counts when munlocking without munmapping;
> and its complement, implementation of mlock_vma_pages_range(), required
> to raise the mlock_counts on pages already there when a range is mlocked.
>
> Combine them into just the one function mlock_vma_pages_range(), using
> walk_page_range() to run mlock_pte_range(). This approach fixes the
> "Very slow unlockall()" of unpopulated PROT_NONE areas, reported in
> https://lore.kernel.org/linux-mm/70885d37-62b7-748b-29df-9e94f3291736@gmail.com/
>
> Munlock clears VM_LOCKED at the start, under exclusive mmap_lock; but if
> a racing truncate or holepunch (depending on i_mmap_rwsem) gets to the
> pte first, it will not try to munlock the page: leaving release_pages()
> to correct it when the last reference to the page is gone - that's okay,
> a page is not evictable anyway while it is held by an extra reference.
>
> Mlock sets VM_LOCKED at the start, under exclusive mmap_lock; but if
> a racing remove_migration_pte() or try_to_unmap_one() (depending on
> i_mmap_rwsem) gets to the pte first, it will try to mlock the page,
> then mlock_pte_range() mlock it a second time. This is harder to
> reproduce, but a more serious race because it could leave the page
> unevictable indefinitely though the area is munlocked afterwards.
> Guard against it by setting the (inappropriate) VM_IO flag,
> and modifying mlock_vma_page() to decline such vmas.
>
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
Acked-by: Vlastimil Babka <vbabka@...e.cz>
> @@ -162,8 +230,7 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
> pgoff_t pgoff;
> int nr_pages;
> int ret = 0;
> - int lock = !!(newflags & VM_LOCKED);
> - vm_flags_t old_flags = vma->vm_flags;
> + vm_flags_t oldflags = vma->vm_flags;
>
> if (newflags == vma->vm_flags || (vma->vm_flags & VM_SPECIAL) ||
Nit: can use oldflags instead of vma->vm_flags above?
Powered by blists - more mailing lists