Message-ID: <47f6d39c-a075-50cb-1cfb-26dd957a48af@google.com>
Date: Mon, 28 Mar 2022 14:44:11 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
cc: Mike Galbraith <efault@....de>, Vlastimil Babka <vbabka@...e.cz>,
Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: [PATCH] mm/munlock: add lru_add_drain() to fix memcg_stat_test
Mike reports that LTP memcg_stat_test usually leads to...
memcg_stat_test 3 TINFO: Test unevictable with MAP_LOCKED
memcg_stat_test 3 TINFO: Running memcg_process --mmap-lock1 -s 135168
memcg_stat_test 3 TINFO: Warming up pid: 3460
memcg_stat_test 3 TINFO: Process is still here after warm up: 3460
memcg_stat_test 3 TFAIL: unevictable is 122880, 135168 expected
...but may lead to...
memcg_stat_test 4 TINFO: Test unevictable with mlock
memcg_stat_test 4 TINFO: Running memcg_process --mmap-lock2 -s 135168
memcg_stat_test 4 TINFO: Warming up pid: 4271
memcg_stat_test 4 TINFO: Process is still here after warm up: 4271
memcg_stat_test 4 TFAIL: unevictable is 122880, 135168 expected
...or both. A wee bit flaky.
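(The shortfall is 135168 - 122880 = 12288 bytes, i.e. exactly three 4kB pages - consistent with a handful of freshly mlocked pages still sitting in a per-CPU pagevec when the stat is read, rather than any page failing to be mlocked.)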
follow_page_pte() used to have an lru_add_drain() for each page mlocked,
and the test came to rely on accurate stats. The pagevec to be drained
is different now, but still covered by lru_add_drain(); and, never mind
the test, I believe it's in everyone's interest that a bulk faulting
interface like populate_vma_page_range() or faultin_vma_page_range()
should drain its local pagevecs at the end, to save others sometimes
needing the much more expensive lru_add_drain_all(). This does not
absolutely guarantee exact stats - the mlocking task can be migrated
between CPUs as it proceeds - but it's good enough and the tests pass.
Reported-by: Mike Galbraith <efault@....de>
Fixes: b67bf49ce7aa ("mm/munlock: delete FOLL_MLOCK and FOLL_POPULATE")
Signed-off-by: Hugh Dickins <hughd@...gle.com>
---
One can argue about the Fixes tag - I'd have chosen the pagevec commit,
but this is the one Mike identified, and the one which touches mm/gup.c,
so I'm happy to go with this - and it doesn't matter so long as 5.18 gets it.
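For anyone who wants to see the flakiness without setting up LTP: the failing
cases boil down to something like the sketch below. This is a hypothetical
minimal reproducer, not the LTP source - memcg setup, the memory.stat
sampling and error handling are all omitted. Before this patch, a few of the
freshly mlocked pages could still be sitting in the faulting CPU's local
pagevec when "unevictable" was sampled.

/* Hypothetical sketch of what memcg_process --mmap-lock1 amounts to:
 * fault in and mlock 135168 bytes via MAP_LOCKED, then sit there
 * while the harness samples "unevictable" from memory.stat.
 */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define TEST_SIZE 135168

int main(void)
{
	/* MAP_LOCKED populates and mlocks the whole range up front */
	char *p = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	memset(p, 0, TEST_SIZE);
	pause();	/* hold the mapping while the stats are checked */
	return 0;
}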
mm/gup.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- 5.18-pre/mm/gup.c
+++ linux/mm/gup.c
@@ -1404,6 +1404,7 @@ long populate_vma_page_range(struct vm_a
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
 	int gup_flags;
+	long ret;
 
 	VM_BUG_ON(!PAGE_ALIGNED(start));
 	VM_BUG_ON(!PAGE_ALIGNED(end));
@@ -1438,8 +1439,10 @@ long populate_vma_page_range(struct vm_a
 	 * We made sure addr is within a VMA, so the following will
 	 * not result in a stack expansion that recurses back here.
 	 */
-	return __get_user_pages(mm, start, nr_pages, gup_flags,
+	ret = __get_user_pages(mm, start, nr_pages, gup_flags,
 				NULL, NULL, locked);
+	lru_add_drain();
+	return ret;
 }
 
 /*
@@ -1471,6 +1474,7 @@ long faultin_vma_page_range(struct vm_ar
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
 	int gup_flags;
+	long ret;
 
 	VM_BUG_ON(!PAGE_ALIGNED(start));
 	VM_BUG_ON(!PAGE_ALIGNED(end));
@@ -1498,8 +1502,10 @@ long faultin_vma_page_range(struct vm_ar
 	if (check_vma_flags(vma, gup_flags))
 		return -EINVAL;
 
-	return __get_user_pages(mm, start, nr_pages, gup_flags,
+	ret = __get_user_pages(mm, start, nr_pages, gup_flags,
 				NULL, NULL, locked);
+	lru_add_drain();
+	return ret;
 }
 
 /*