Date:   Thu, 16 Apr 2020 12:35:14 +0900
From:   Jaewon Kim <jaewon31.kim@...sung.com>
To:     minchan@...nel.org, mgorman@...e.de, m.szyprowski@...sung.com,
        mina86@...a86.com, riel@...hat.com, akpm@...ux-foundation.org
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        jaewon31.kim@...il.com, ytk.lee@...sung.com,
        Jaewon Kim <jaewon31.kim@...sung.com>
Subject: [PATCH] mm/vmscan: skip lazyfree page on
 reclaim_clean_pages_from_list

This patch fixes an nr_isolated_* accounting mismatch that occurs when
cma alloc encounters dirty lazyfree pages.

If try_to_unmap_one is used for reclaim and it detects a dirty lazyfree
page, the lazyfree page is changed back to a normal anon page with
SwapBacked set, since commit 18863d3a3f59 ("mm: remove SWAP_DIRTY in
ttu"). Even with that change, the reclaim context counts isolated files
correctly because it uses is_file_lru to distinguish file pages. The
change to anon does not happen if try_to_unmap_one is used for
migration, so a migration context such as compaction also counts
isolated files correctly even though it uses page_is_file_lru instead
of is_file_lru. page_is_file_cache was recently renamed to
page_is_file_lru by commit 9de4f22a60f7 ("mm: code cleanup for
MADV_FREE").

But the nr_isolated_* mismatch does happen on cma alloc.
reclaim_clean_pages_from_list is used only by cma. It was introduced by
commit 02c6de8d757c ("mm: cma: discard clean pages during contiguous
allocation instead of migration") to reclaim clean file pages without
migration. The cma alloc uses both reclaim_clean_pages_from_list and
migrate_pages, and it uses page_is_file_lru to count isolated files. If
there are dirty lazyfree pages allocated from the cma memory region,
the pages are counted as isolated file at the beginning but as isolated
anon once the allocation has finished.

Mem-Info:
Node 0 active_anon:3045904kB inactive_anon:611448kB active_file:14892kB inactive_file:205636kB unevictable:10416kB isolated(anon):0kB isolated(file):37664kB mapped:630216kB dirty:384kB writeback:0kB shmem:42576kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no

As the log above shows, too much memory, 37664kB, was accounted as
isolated file, which triggers too_many_isolated in reclaim even though
no file pages are actually isolated system-wide. The problem can be
reproduced by running two programs, one doing MADV_FREE and writing,
the other doing cma alloc. Although isolated anon is reported as 0, I
found that the internal value of isolated anon was the negative of the
isolated file value.
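
The accounting drift described above can be reproduced in a small
userspace model. This is only a sketch: every toy_* name is
hypothetical and not a kernel symbol, and it assumes that the reported
value is the raw counter clamped at zero, which would be consistent
with the log showing isolated(anon):0kB.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
enum toy_lru { TOY_ANON = 0, TOY_FILE = 1 };

struct toy_page {
	bool anon;        /* stands in for PageAnon() */
	bool swap_backed; /* stands in for PageSwapBacked() */
};

struct toy_counters {
	long isolated[2]; /* raw values, indexed by enum toy_lru */
};

/* Like page_is_file_lru(): anything not SwapBacked counts as file. */
static enum toy_lru toy_page_lru(const struct toy_page *page)
{
	return page->swap_backed ? TOY_ANON : TOY_FILE;
}

/* A dirty lazyfree page that reclaim fails to discard becomes normal anon. */
static void toy_reclaim_dirty_lazyfree(struct toy_page *page)
{
	assert(page->anon && !page->swap_backed);
	page->swap_backed = true;
}

/* Isolate one page, attempt reclaim, then put the page back. */
static void toy_cma_round(struct toy_counters *c, struct toy_page *page)
{
	c->isolated[toy_page_lru(page)]++; /* accounted as file */
	toy_reclaim_dirty_lazyfree(page);
	c->isolated[toy_page_lru(page)]--; /* released as anon */
}

/* Assumed reporting rule: a negative raw counter is shown as 0. */
static long toy_reported(long raw)
{
	return raw < 0 ? 0 : raw;
}
```

Calling toy_cma_round() once on a lazyfree page (anon set, swap_backed
clear) leaves the file counter at +1 and the anon counter at -1, while
toy_reported() shows the anon counter as 0.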

Fix this by skipping anon pages in reclaim_clean_pages_from_list. A
lazyfree page can be identified by checking both PageAnon(page) and
page_is_file_lru, but in this case PageAnon alone is enough because it
skips all anon pages.
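
The effect of the added check on candidate selection can be sketched in
userspace as follows. The toy_* names and the flag struct are
hypothetical stand-ins for the page flag tests in the loop, and the
sketch assumes that a lazyfree page looks clean to the PageDirty-style
flag at the time it is filtered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page flags tested in the loop. */
struct toy_page {
	bool anon;        /* PageAnon() */
	bool swap_backed; /* clear for file and lazyfree pages */
	bool dirty;       /* PageDirty() */
	bool movable;     /* __PageMovable() */
	bool unevictable; /* PageUnevictable() */
};

/* Candidate test with the PageAnon-style check applied first. */
static bool toy_is_clean_file_candidate(const struct toy_page *page)
{
	if (page->anon)
		return false; /* skips lazyfree and every other anon page */
	return !page->swap_backed && !page->dirty &&
	       !page->movable && !page->unevictable;
}

/* Count how many pages in an array would be handed to reclaim. */
static size_t toy_count_candidates(const struct toy_page *pages, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (toy_is_clean_file_candidate(&pages[i]))
			count++;
	return count;
}
```

Without the anon check, a lazyfree page with swap_backed clear would
pass the remaining tests and be treated as a clean file page.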

Reported-by: Yong-Taek Lee <ytk.lee@...sung.com>
Signed-off-by: Jaewon Kim <jaewon31.kim@...sung.com>
---
 mm/vmscan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b06868fc4926..9380a18eef5e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1497,6 +1497,9 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 	LIST_HEAD(clean_pages);
 
 	list_for_each_entry_safe(page, next, page_list, lru) {
+		/* to avoid race with MADV_FREE anon page */
+		if (PageAnon(page))
+			continue;
 		if (page_is_file_lru(page) && !PageDirty(page) &&
 		    !__PageMovable(page) && !PageUnevictable(page)) {
 			ClearPageActive(page);
-- 
2.13.7
