Message-ID: <f9960727-d061-8856-45ce-4e33e8ed1de6@suse.cz>
Date: Fri, 11 Feb 2022 19:49:11 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Hugh Dickins <hughd@...gle.com>,
 Andrew Morton <akpm@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...e.com>,
 "Kirill A. Shutemov" <kirill@...temov.name>,
 Matthew Wilcox <willy@...radead.org>,
 David Hildenbrand <david@...hat.com>,
 Alistair Popple <apopple@...dia.com>,
 Johannes Weiner <hannes@...xchg.org>,
 Rik van Riel <riel@...riel.com>,
 Suren Baghdasaryan <surenb@...gle.com>,
 Yu Zhao <yuzhao@...gle.com>, Greg Thelen <gthelen@...gle.com>,
 Shakeel Butt <shakeelb@...gle.com>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 11/13] mm/munlock: page migration needs mlock pagevec drained
On 2/6/22 22:49, Hugh Dickins wrote:
> Page migration of a VM_LOCKED page tends to fail, because when the old
> page is unmapped, it is put on the mlock pagevec with raised refcount,
> which then fails the freeze.
>
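For context, the "freeze" here is the refcount freeze in the migration
core's mapping-move step; a minimal sketch of the failing check (names and
bookkeeping simplified from mm/migrate.c):

	/*
	 * The migration core counts the references it expects to hold
	 * (the mapping's, its own, ...) and tries to freeze the page's
	 * refcount at exactly that value.  A page still parked on the
	 * per-cpu mlock pagevec carries one extra get_page() reference,
	 * so the freeze - and with it this migration attempt - fails.
	 */
	if (!page_ref_freeze(page, expected_count))
		return -EAGAIN;
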
> At first I thought this would be fixed by a local mlock_page_drain() at
> the upper rmap_walk() level - which would have nicely batched all the
> munlocks of that page; but tests show that the task can too easily move
> to another cpu, leaving pagevec residue behind which fails the migration.
>
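(To see why a drain at the rmap_walk() level is not enough: the mlock
pagevec is per-cpu, and a drain only flushes the pagevec of the cpu it is
told about - roughly, from this series' mm/mlock.c, where mlock_pagevec()
is the internal flush helper:)

	static DEFINE_PER_CPU(struct pagevec, mlock_pvec);

	void mlock_page_drain(int cpu)
	{
		struct pagevec *pvec = &per_cpu(mlock_pvec, cpu);

		/*
		 * Flushes only the given cpu's pending pages: if the
		 * task queued a page and was then migrated to another
		 * cpu, the residue stays on the old cpu's pagevec.
		 */
		if (pagevec_count(pvec))
			mlock_pagevec(pvec);
	}
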
> So make try_to_migrate_one() drain the local pagevec after page_remove_rmap()
> from a VM_LOCKED vma; and do the same in try_to_unmap_one(), whose
> TTU_IGNORE_MLOCK users would want the same treatment; and do the same
> in remove_migration_pte() - not important when successfully inserting
> a new page, but necessary when hoping to retry after failure.
>
> Any new pagevec runs the risk of adding a new way of stranding, and we
> might discover other corners where mlock_page_drain() or lru_add_drain()
> would now help. If the mlock pagevec raises doubts, we can easily add a
> sysctl to tune its length to 1, which reverts to synchronous operation.
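(For concreteness, a purely hypothetical sketch of such a knob -
sysctl_mlock_pvec_len does not exist in this series, and the real
mlock_page() also does the Mlocked accounting, elided here:)

	/* Hypothetical sysctl-tunable cap; defaults to the usual size. */
	int sysctl_mlock_pvec_len = PAGEVEC_SIZE;

	void mlock_page(struct page *page)
	{
		struct pagevec *pvec = &get_cpu_var(mlock_pvec);

		get_page(page);
		/* pagevec_add() returns 0 when the pagevec is full */
		if (!pagevec_add(pvec, page) ||
		    pagevec_count(pvec) >= sysctl_mlock_pvec_len)
			mlock_pagevec(pvec);	/* len == 1: synchronous */
		put_cpu_var(mlock_pvec);
	}
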
Not a fan of adding new sysctls like this, as that just pushes the failure
of kernel devs onto poor admins :)
The old pagevec usage deleted by patch 1 was limited to the naturally larger
munlock_vma_pages_range() operation. The new per-cpu based one is more
general, which obviously has its advantages, but it might also bring new
corner cases.
So if this turns out to be a big problem, I would rather go back to the
limited-scenario pagevec than add a sysctl?
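What I mean by that, roughly (a hypothetical sketch, not real code: the
pagevec lives on the munlock caller's stack, so it is always drained before
returning and cannot leave residue behind on another cpu):

	static void munlock_vma_pages_range(struct vm_area_struct *vma,
					    unsigned long start,
					    unsigned long end)
	{
		struct pagevec pvec;

		pagevec_init(&pvec);
		/*
		 * ... walk the range, gathering mlocked pages into pvec
		 * and flushing with mlock_pagevec(&pvec) whenever full ...
		 */
		mlock_pagevec(&pvec);	/* final drain: no residue left */
	}
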
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
Acked-by: Vlastimil Babka <vbabka@...e.cz>
> ---
>  mm/migrate.c | 2 ++
>  mm/rmap.c    | 4 ++++
>  2 files changed, 6 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f4bcf1541b62..e7d0b68d5dcb 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -251,6 +251,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
>  			page_add_file_rmap(new, vma, false);
>  			set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
>  		}
> +		if (vma->vm_flags & VM_LOCKED)
> +			mlock_page_drain(smp_processor_id());
>  
>  		/* No need to invalidate - it was non-present before */
>  		update_mmu_cache(vma, pvmw.address, pvmw.pte);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 5442a5c97a85..714bfdc72c7b 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1656,6 +1656,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  		 * See Documentation/vm/mmu_notifier.rst
>  		 */
>  		page_remove_rmap(subpage, vma, PageHuge(page));
> +		if (vma->vm_flags & VM_LOCKED)
> +			mlock_page_drain(smp_processor_id());
>  		put_page(page);
>  	}
>
> @@ -1930,6 +1932,8 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>  		 * See Documentation/vm/mmu_notifier.rst
>  		 */
>  		page_remove_rmap(subpage, vma, PageHuge(page));
> +		if (vma->vm_flags & VM_LOCKED)
> +			mlock_page_drain(smp_processor_id());
>  		put_page(page);
>  	}
>