[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <1467997624-9537-3-git-send-email-kamal@canonical.com>
Date: Fri, 8 Jul 2016 10:07:01 -0700
From: Kamal Mostafa <kamal@...onical.com>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
kernel-team@...ts.ubuntu.com
Cc: Hugh Dickins <hughd@...gle.com>, Christoph Lameter <cl@...ux.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Rik van Riel <riel@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
Davidlohr Bueso <dave@...olabs.net>,
Oleg Nesterov <oleg@...hat.com>,
Sasha Levin <sasha.levin@...cle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ben Hutchings <ben@...adent.org.uk>,
Luis Henriques <luis.henriques@...onical.com>,
Kamal Mostafa <kamal@...onical.com>
Subject: [PATCH 4.2.y-ckt 2/5] mm: migrate dirty page without clear_page_dirty_for_io etc
4.2.8-ckt13 -stable review patch. If anyone has any objections, please let me know.
---8<------------------------------------------------------------
From: Hugh Dickins <hughd@...gle.com>
commit 42cb14b110a5698ccf26ce59c4441722605a3743 upstream.
clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty, later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.
No actual problems seen with this procedure recently, but if you look into
what the clear_page_dirty_for_io(page)+set_page_dirty(newpage) is actually
achieving, it turns out to be nothing more than moving the PageDirty flag,
and its NR_FILE_DIRTY stat from one zone to another.
It would be good to avoid a pile of irrelevant decrementations and
incrementations, and improper event counting, and unnecessary descent of
the radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).
Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same. Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).
But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().
Signed-off-by: Hugh Dickins <hughd@...gle.com>
Cc: Christoph Lameter <cl@...ux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Rik van Riel <riel@...hat.com>
Cc: Vlastimil Babka <vbabka@...e.cz>
Cc: Davidlohr Bueso <dave@...olabs.net>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Sasha Levin <sasha.levin@...cle.com>
Cc: Dmitry Vyukov <dvyukov@...gle.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
[bwh: Backported to 3.16: adjust context. This is not just an optimisation,
but turned out to fix a possible oops (CVE-2016-3070).]
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
Signed-off-by: Luis Henriques <luis.henriques@...onical.com>
Signed-off-by: Kamal Mostafa <kamal@...onical.com>
---
mm/migrate.c | 51 +++++++++++++++++++++++++++++++--------------------
1 file changed, 31 insertions(+), 20 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index fcb6204..a14784c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -30,6 +30,7 @@
#include <linux/mempolicy.h>
#include <linux/vmalloc.h>
#include <linux/security.h>
+#include <linux/backing-dev.h>
#include <linux/memcontrol.h>
#include <linux/syscalls.h>
#include <linux/hugetlb.h>
@@ -310,6 +311,8 @@ int migrate_page_move_mapping(struct address_space *mapping,
struct buffer_head *head, enum migrate_mode mode,
int extra_count)
{
+ struct zone *oldzone, *newzone;
+ int dirty;
int expected_count = 1 + extra_count;
void **pslot;
@@ -320,6 +323,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
return MIGRATEPAGE_SUCCESS;
}
+ oldzone = page_zone(page);
+ newzone = page_zone(newpage);
+
spin_lock_irq(&mapping->tree_lock);
pslot = radix_tree_lookup_slot(&mapping->page_tree,
@@ -360,6 +366,13 @@ int migrate_page_move_mapping(struct address_space *mapping,
set_page_private(newpage, page_private(page));
}
+ /* Move dirty while page refs frozen and newpage not yet exposed */
+ dirty = PageDirty(page);
+ if (dirty) {
+ ClearPageDirty(page);
+ SetPageDirty(newpage);
+ }
+
radix_tree_replace_slot(pslot, newpage);
/*
@@ -369,6 +382,9 @@ int migrate_page_move_mapping(struct address_space *mapping,
*/
page_unfreeze_refs(page, expected_count - 1);
+ spin_unlock(&mapping->tree_lock);
+ /* Leave irq disabled to prevent preemption while updating stats */
+
/*
* If moved to a different zone then also account
* the page for that zone. Other VM counters will be
@@ -379,13 +395,19 @@ int migrate_page_move_mapping(struct address_space *mapping,
* via NR_FILE_PAGES and NR_ANON_PAGES if they
* are mapped to swap space.
*/
- __dec_zone_page_state(page, NR_FILE_PAGES);
- __inc_zone_page_state(newpage, NR_FILE_PAGES);
- if (!PageSwapCache(page) && PageSwapBacked(page)) {
- __dec_zone_page_state(page, NR_SHMEM);
- __inc_zone_page_state(newpage, NR_SHMEM);
+ if (newzone != oldzone) {
+ __dec_zone_state(oldzone, NR_FILE_PAGES);
+ __inc_zone_state(newzone, NR_FILE_PAGES);
+ if (PageSwapBacked(page) && !PageSwapCache(page)) {
+ __dec_zone_state(oldzone, NR_SHMEM);
+ __inc_zone_state(newzone, NR_SHMEM);
+ }
+ if (dirty && mapping_cap_account_dirty(mapping)) {
+ __dec_zone_state(oldzone, NR_FILE_DIRTY);
+ __inc_zone_state(newzone, NR_FILE_DIRTY);
+ }
}
- spin_unlock_irq(&mapping->tree_lock);
+ local_irq_enable();
return MIGRATEPAGE_SUCCESS;
}
@@ -509,20 +531,9 @@ void migrate_page_copy(struct page *newpage, struct page *page)
if (PageMappedToDisk(page))
SetPageMappedToDisk(newpage);
- if (PageDirty(page)) {
- clear_page_dirty_for_io(page);
- /*
- * Want to mark the page and the radix tree as dirty, and
- * redo the accounting that clear_page_dirty_for_io undid,
- * but we can't use set_page_dirty because that function
- * is actually a signal that all of the page has become dirty.
- * Whereas only part of our page may be dirty.
- */
- if (PageSwapBacked(page))
- SetPageDirty(newpage);
- else
- __set_page_dirty_nobuffers(newpage);
- }
+ /* Move dirty on pages not done by migrate_page_move_mapping() */
+ if (PageDirty(page))
+ SetPageDirty(newpage);
/*
* Copy NUMA information to the new page, to prevent over-eager
--
2.7.4
Powered by blists - more mailing lists