[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191106060930.2571389-2-songliubraving@fb.com>
Date: Tue, 5 Nov 2019 22:09:29 -0800
From: Song Liu <songliubraving@...com>
To: <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<akpm@...ux-foundation.org>
CC: <matthew.wilcox@...cle.com>, <kernel-team@...com>,
<william.kucharski@...cle.com>, <kirill.shutemov@...ux.intel.com>,
Song Liu <songliubraving@...com>,
Johannes Weiner <hannes@...xchg.org>,
Hugh Dickins <hughd@...gle.com>
Subject: [PATCH v5 1/2] mm,thp: recheck each page before collapsing file THP
In collapse_file(), for !is_shmem case, current check cannot guarantee
the locked page is up-to-date. Specifically, xas_unlock_irq() should
not be called before lock_page() and get_page(); and it is necessary to
recheck PageUptodate() after locking the page.
With this bug and CONFIG_READ_ONLY_THP_FOR_FS=y, madvise(HUGE)'ed .text
may contain corrupted data. This is because khugepaged mistakenly
collapses some not up-to-date sub pages into a huge page, and assumes
the huge page is up-to-date. This will NOT corrupt data in the disk,
because the page is read-only and never written back. Fix this by
properly checking PageUptodate() after locking the page. This check
replaces "VM_BUG_ON_PAGE(!PageUptodate(page), page);".
Also, move PageDirty() check after locking the page. Current
khugepaged should not try to collapse dirty file THP, because it is
limited to read-only .text. The only case we hit a dirty page here is
when the page hasn't been written since write. Bail out and retry when
this happens.
syzbot reported bug on previous version of this patch.
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Reported-by: syzbot+efb9e48b9fbdc49bb34a@...kaller.appspotmail.com
Cc: Johannes Weiner <hannes@...xchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Cc: Hugh Dickins <hughd@...gle.com>
Cc: William Kucharski <william.kucharski@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Song Liu <songliubraving@...com>
---
[songliubraving@...com: v4]
Link: http://lkml.kernel.org/r/20191022191006.411277-1-songliubraving@fb.com
[songliubraving@...com: fix deadlock in collapse_file()]
Link: http://lkml.kernel.org/r/20191028221414.3685035-1-songliubraving@fb.com
Link: http://lkml.kernel.org/r/20191018180345.4188310-1-songliubraving@fb.com
[songliubraving@...com: flush file for !is_shmem PageDirty() case in collapse_file()]
https://lkml.kernel.org/linux-mm/20191030200736.3455046-1-songliubraving@fb.com/
---
mm/khugepaged.c | 28 ++++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0a1b4b484ac5..40215795d641 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1601,17 +1601,6 @@ static void collapse_file(struct mm_struct *mm,
result = SCAN_FAIL;
goto xa_unlocked;
}
- } else if (!PageUptodate(page)) {
- xas_unlock_irq(&xas);
- wait_on_page_locked(page);
- if (!trylock_page(page)) {
- result = SCAN_PAGE_LOCK;
- goto xa_unlocked;
- }
- get_page(page);
- } else if (PageDirty(page)) {
- result = SCAN_FAIL;
- goto xa_locked;
} else if (trylock_page(page)) {
get_page(page);
xas_unlock_irq(&xas);
@@ -1626,7 +1615,12 @@ static void collapse_file(struct mm_struct *mm,
* without racing with truncate.
*/
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(!PageUptodate(page), page);
+
+ /* make sure the page is up to date */
+ if (unlikely(!PageUptodate(page))) {
+ result = SCAN_FAIL;
+ goto out_unlock;
+ }
/*
* If file was truncated then extended, or hole-punched, before
@@ -1642,6 +1636,16 @@ static void collapse_file(struct mm_struct *mm,
goto out_unlock;
}
+ if (!is_shmem && PageDirty(page)) {
+ /*
+ * khugepaged only works on read-only fd, so this
+ * page is dirty because it hasn't been flushed
+ * since first write.
+ */
+ result = SCAN_FAIL;
+ goto out_unlock;
+ }
+
if (isolate_lru_page(page)) {
result = SCAN_DEL_PAGE_LRU;
goto out_unlock;
--
2.17.1
Powered by blists - more mailing lists