[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190226124649.GH11981@rapoport-lnx>
Date: Tue, 26 Feb 2019 14:46:49 +0200
From: Mike Rapoport <rppt@...ux.ibm.com>
To: Sasha Levin <sashal@...nel.org>
Cc: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
Michal Hocko <mhocko@...e.com>,
Pavel Tatashin <pasha.tatashin@...een.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-mm@...ck.org
Subject: Re: [PATCH AUTOSEL 4.20 66/72] mm, memory_hotplug:
is_mem_section_removable do not pass the end of a zone
On Sat, Feb 23, 2019 at 04:04:16PM -0500, Sasha Levin wrote:
> From: Michal Hocko <mhocko@...e.com>
>
> [ Upstream commit efad4e475c312456edb3c789d0996d12ed744c13 ]
There is a fix for this fix [1].
It's commit 891cb2a72d821f930a39d5900cb7a3aa752c1d5b ("mm, memory_hotplug:
fix off-by-one in is_pageblock_removable") in mainline.
[1] https://lore.kernel.org/lkml/20190218181544.14616-1-mhocko@kernel.org/
> Patch series "mm, memory_hotplug: fix uninitialized pages fallouts", v2.
>
> Mikhail Zaslonko has posted fixes for the two bugs quite some time ago
> [1]. I have pushed back on those fixes because I believed that it is
> much better to plug the problem at the initialization time rather than
> play whack-a-mole all over the hotplug code and find all the places
> which expect the full memory section to be initialized.
>
> We have ended up with commit 2830bf6f05fb ("mm, memory_hotplug:
> initialize struct pages for the full memory section") merged and cause a
> regression [2][3]. The reason is that there might be memory layouts
> when two NUMA nodes share the same memory section so the merged fix is
> simply incorrect.
>
> In order to plug this hole we really have to be zone range aware in
> those handlers. I have split up the original patch into two. One is
> unchanged (patch 2) and I took a different approach for `removable'
> crash.
>
> [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
>
> This patch (of 2):
>
> Mikhail has reported the following VM_BUG_ON triggered when reading sysfs
> removable state of a memory block:
>
> page:000003d08300c000 is uninitialized and poisoned
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> Call Trace:
> is_mem_section_removable+0xb4/0x190
> show_mem_removable+0x9a/0xd8
> dev_attr_show+0x34/0x70
> sysfs_kf_seq_show+0xc8/0x148
> seq_read+0x204/0x480
> __vfs_read+0x32/0x178
> vfs_read+0x82/0x138
> ksys_read+0x5a/0xb0
> system_call+0xdc/0x2d8
> Last Breaking-Event-Address:
> is_mem_section_removable+0xb4/0x190
> Kernel panic - not syncing: Fatal exception: panic_on_oops
>
> The reason is that the memory block spans the zone boundary and we are
> stumbling over an unitialized struct page. Fix this by enforcing zone
> range in is_mem_section_removable so that we never run away from a zone.
>
> Link: http://lkml.kernel.org/r/20190128144506.15603-2-mhocko@kernel.org
> Signed-off-by: Michal Hocko <mhocko@...e.com>
> Reported-by: Mikhail Zaslonko <zaslonko@...ux.ibm.com>
> Debugged-by: Mikhail Zaslonko <zaslonko@...ux.ibm.com>
> Tested-by: Gerald Schaefer <gerald.schaefer@...ibm.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
> Reviewed-by: Oscar Salvador <osalvador@...e.de>
> Cc: Pavel Tatashin <pasha.tatashin@...een.com>
> Cc: Heiko Carstens <heiko.carstens@...ibm.com>
> Cc: Martin Schwidefsky <schwidefsky@...ibm.com>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Signed-off-by: Sasha Levin <sashal@...nel.org>
> ---
> mm/memory_hotplug.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 21d94b5677e81..5ce0d929ff482 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1234,7 +1234,8 @@ static bool is_pageblock_removable_nolock(struct page *page)
> bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> {
> struct page *page = pfn_to_page(start_pfn);
> - struct page *end_page = page + nr_pages;
> + unsigned long end_pfn = min(start_pfn + nr_pages, zone_end_pfn(page_zone(page)));
> + struct page *end_page = pfn_to_page(end_pfn);
>
> /* Check the starting page of each pageblock within the range */
> for (; page < end_page; page = next_active_pageblock(page)) {
> --
> 2.19.1
>
--
Sincerely yours,
Mike.
Powered by blists - more mailing lists