Message-ID: <977b6c8b-2df3-5f4b-0d6c-fe766cf3fae0@intel.com>
Date: Tue, 29 Nov 2016 10:57:53 +0800
From: Aaron Lu <aaron.lu@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Linux Memory Management List <linux-mm@...ck.org>,
Dave Hansen <dave.hansen@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Huang Ying <ying.huang@...el.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [PATCH] mremap: move_ptes: check pte dirty after its removal

On 11/29/2016 01:15 AM, Linus Torvalds wrote:
> However, I also independently think I found an actual bug while
> looking at the code as part of looking at the patch.
>
> This part looks racy:
>
> /*
> * We are remapping a dirty PTE, make sure to
> * flush TLB before we drop the PTL for the
> * old PTE or we may race with page_mkclean().
> */
> if (pte_present(*old_pte) && pte_dirty(*old_pte))
> force_flush = true;
> pte = ptep_get_and_clear(mm, old_addr, old_pte);
>
> where the issue is that another thread might make the pte be dirty (in
> the hardware walker, so no locking of ours make any difference)
> *after* we checked whether it was dirty, but *before* we removed it
> from the page tables.
Ah, very right. Thanks for the catch!
>
> So I think the "check for force-flush" needs to come *after*, and we should do
>
> pte = ptep_get_and_clear(mm, old_addr, old_pte);
> if (pte_present(pte) && pte_dirty(pte))
> force_flush = true;
>
> instead.
>
> This happens for the pmd case too.

Here is a fix patch, sorry for the trouble.

From c0dc52fd3d3be93afb5b97804937a1b1b7ef136e Mon Sep 17 00:00:00 2001
From: Aaron Lu <aaron.lu@...el.com>
Date: Tue, 29 Nov 2016 10:33:37 +0800
Subject: [PATCH] mremap: move_ptes: check pte dirty after its removal

Linus found that a race remains in mremap after commit 5d1904204c99
("mremap: fix race between mremap() and page cleanning").

As described by Linus:

    the issue is that another thread might make the pte be dirty (in
    the hardware walker, so no locking of ours make any difference)
    *after* we checked whether it was dirty, but *before* we removed it
    from the page tables.

Fix it by moving the check to after the entry has been removed from the
page table.

Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
Signed-off-by: Aaron Lu <aaron.lu@...el.com>
---
mm/huge_memory.c | 4 ++--
mm/mremap.c | 8 ++++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index eff3de359d50..a3e466c489a9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1456,9 +1456,9 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
new_ptl = pmd_lockptr(mm, new_pmd);
if (new_ptl != old_ptl)
spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
+ pmd = pmdp_huge_get_and_clear(mm, old_addr, old_pmd);
- if (pmd_present(*old_pmd) && pmd_dirty(*old_pmd))
+ if (pmd_present(pmd) && pmd_dirty(pmd))
force_flush = true;
- pmd = pmdp_huge_get_and_clear(mm, old_addr, old_pmd);
VM_BUG_ON(!pmd_none(*new_pmd));
if (pmd_move_must_withdraw(new_ptl, old_ptl) &&
diff --git a/mm/mremap.c b/mm/mremap.c
index 6ccecc03f56a..4b39dd0974e5 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -149,14 +149,18 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
if (pte_none(*old_pte))
continue;
+ pte = ptep_get_and_clear(mm, old_addr, old_pte);
/*
* We are remapping a dirty PTE, make sure to
* flush TLB before we drop the PTL for the
* old PTE or we may race with page_mkclean().
+ *
+ * This check has to be done after we removed the
+ * old PTE from page tables or another thread may
+ * dirty it after the check and before the removal.
*/
- if (pte_present(*old_pte) && pte_dirty(*old_pte))
+ if (pte_present(pte) && pte_dirty(pte))
force_flush = true;
- pte = ptep_get_and_clear(mm, old_addr, old_pte);
pte = move_pte(pte, new_vma->vm_page_prot, old_addr, new_addr);
pte = move_soft_dirty_pte(pte);
set_pte_at(mm, new_addr, new_pte, pte);
--
2.5.5
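
For anyone who wants to see the ordering problem outside the kernel, here
is a rough userspace sketch using C11 atomics. It is not kernel code and
is not part of the patch. toy_pte_t, the PTE_* bit values, the toy_*
helpers and the hw_between callback are all made up for illustration; the
callback stands in for a hardware walker setting the dirty bit while we
hold the PTL. The point is only that the dirty check has to look at the
value returned by the atomic clear, as in the snippet Linus suggested.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bit layout, an assumption for this sketch only. */
#define PTE_PRESENT (UINT64_C(1) << 0)
#define PTE_DIRTY   (UINT64_C(1) << 6)

/* Hypothetical stand-in for a page table entry. */
typedef _Atomic uint64_t toy_pte_t;

/* Stand-in for the hardware walker marking the entry dirty. */
static void toy_hw_set_dirty(toy_pte_t *ptep)
{
	atomic_fetch_or(ptep, PTE_DIRTY);
}

/*
 * Racy ordering: read the entry, decide on force_flush, then clear it.
 * hw_between (may be NULL) runs inside the window between the check and
 * the clear, so a dirty bit set there is missed by the check.
 */
static bool toy_move_check_first(toy_pte_t *ptep,
				 void (*hw_between)(toy_pte_t *),
				 uint64_t *moved)
{
	uint64_t seen = atomic_load(ptep);
	bool force_flush = (seen & PTE_PRESENT) && (seen & PTE_DIRTY);

	if (hw_between)
		hw_between(ptep);

	*moved = atomic_exchange(ptep, 0);
	return force_flush;
}

/*
 * Fixed ordering: clear the entry atomically and test the value that
 * came back. Anything written before the exchange is part of *moved;
 * after the exchange the entry is no longer present, so there is
 * nothing left for the walker to dirty.
 */
static bool toy_move_clear_first(toy_pte_t *ptep,
				 void (*hw_between)(toy_pte_t *),
				 uint64_t *moved)
{
	if (hw_between)
		hw_between(ptep);

	*moved = atomic_exchange(ptep, 0);
	return (*moved & PTE_PRESENT) && (*moved & PTE_DIRTY);
}
```

Starting from a present, clean entry and passing toy_hw_set_dirty as the
callback, toy_move_check_first() reports no flush needed even though the
value it moved carries the dirty bit, while toy_move_clear_first()
reports that a flush is needed.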