[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM5PR21MB0508CEC7F586EBC89D2CCFBB9D3F0@DM5PR21MB0508.namprd21.prod.outlook.com>
Date: Wed, 15 Aug 2018 14:05:35 +0000
From: Pavel Tatashin <Pavel.Tatashin@...rosoft.com>
To: "osalvador@...hadventures.net" <osalvador@...hadventures.net>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
CC: "mhocko@...e.com" <mhocko@...e.com>,
"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"pasha.tatashin@...cle.com" <pasha.tatashin@...cle.com>,
"jglisse@...hat.com" <jglisse@...hat.com>,
"david@...hat.com" <david@...hat.com>,
"yasu.isimatu@...il.com" <yasu.isimatu@...il.com>,
"logang@...tatee.com" <logang@...tatee.com>,
"dave.jiang@...el.com" <dave.jiang@...el.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Oscar Salvador <osalvador@...e.de>
Subject: RE: [RFC PATCH 0/3] Do not touch pages in remove_memory path
> This tries to fix [1], which was reported by David Hildenbrand, and also
> does some cleanups/refactoring.
Hi Oscar,
I would like to review this work. Are you in process of sending a new version? If so, I will wait for it.
Thank you,
Pavel
>
> I am sending this as RFC to see if the direction I am going is right before
> spending more time into it.
> And also to gather feedback about hmm/zone_device stuff.
> The code compiles and I tested it successfully with normal memory-hotplug
> operations.
>
> Here we go:
>
> With the following scenario:
>
> 1) We add memory
> 2) We do not online it
> 3) We remove the memory
>
> an invalid access is being made to those pages.
>
> The reason is that the pages are initialized in online_pages() path:
>
> / online_pages
> | move_pfn_range
> ONLINE | move_pfn_range_to_zone
> PAGES | ...
> | memmap_init_zone
>
> But depending on our policy about onlining the pages by default, we might not
> online them right after having added the memory, and so, those pages might
> be
> left unitialized.
>
> This is a problem because we access those pages in arch_remove_memory:
>
> ...
> if (altmap)
> page += vmem_altmap_offset(altmap);
> zone = page_zone(page);
> ...
>
> So we are accessing unitialized data basically.
>
>
> Currently, we need to have the zone from arch_remove_memory to all the
> way down
> because
>
> 1) we call __remove_zone zo shrink spanned pages from pgdat/zone
> 2) we get the pgdat from the zone
>
> Number 1 can be fixed by moving __remove_zone back to offline_pages(),
> where it should be.
> This, besides fixing the bug, will make the code more consistent because all
> the reveserse
> operations from online_pages() will be made in offline_pages().
>
> Number 2 can be fixed by passing nid instead of zone.
>
> The tricky part of all this is the hmm code and the zone_device stuff.
>
> Fixing the calls to arch_remove_memory in the arch code is easy, but
> arch_remove_memory
> is being used in:
>
> kernel/memremap.c: devm_memremap_pages_release()
> mm/hmm.c: hmm_devmem_release()
>
> I did my best to get my head around this, but my knowledge in that area is 0,
> so I am pretty sure
> I did not get it right.
>
> The thing is:
>
> devm_memremap_pages(), which is the counterpart of
> devm_memremap_pages_release(),
> calls arch_add_memory(), and then calls move_pfn_range_to_zone() (to
> ZONE_DEVICE).
> So it does not go through online_pages().
> So there I call shrink_pages() (it does pretty much as __remove_zone) before
> calling
> to arch_remove_memory.
> But as I said, I do now if that is right.
>
> [1] https://patchwork.kernel.org/patch/10547445/
>
> Oscar Salvador (3):
> mm/memory_hotplug: Add nid parameter to arch_remove_memory
> mm/memory_hotplug: Create __shrink_pages and move it to offline_pages
> mm/memory_hotplug: Refactor shrink_zone/pgdat_span
>
> arch/ia64/mm/init.c | 6 +-
> arch/powerpc/mm/mem.c | 13 +--
> arch/s390/mm/init.c | 2 +-
> arch/sh/mm/init.c | 6 +-
> arch/x86/mm/init_32.c | 6 +-
> arch/x86/mm/init_64.c | 10 +--
> include/linux/memory_hotplug.h | 8 +-
> kernel/memremap.c | 9 +-
> mm/hmm.c | 6 +-
> mm/memory_hotplug.c | 190 +++++++++++++++++++++--------------------
> mm/sparse.c | 4 +-
> 11 files changed, 127 insertions(+), 133 deletions(-)
>
> --
> 2.13.6
Powered by blists - more mailing lists