[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2e36a929-0fc7-d32a-d838-de746ff071fc@redhat.com>
Date: Thu, 10 Oct 2019 20:55:41 +0200
From: David Hildenbrand <david@...hat.com>
To: Qian Cai <cai@....pw>, akpm@...ux-foundation.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
"mhocko@...e.com" <mhocko@...e.com>
Subject: Re: [PATCH] mm/page_owner: fix a crash after memory offline
On 10.10.19 20:32, Qian Cai wrote:
> The linux-next series "mm/memory_hotplug: Shrink zones before removing
> memory" [1] seems make a crash easier to reproduce while reading
> /proc/pagetypeinfo after offlining a memory section. Fix it by using
> pfn_to_online_page() in the PFN walker.
Can you please rephrase the subject+description to describe the actual
problem and drop the reference to the series?
E.g., similar to my recent patches:
"mm/page_owner: Don't access uninitialized memmaps when reading
/proc/pagetypeinfo
Uninitialized memmaps contain garbage and in the worst case trigger
kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
touched.
For example, when not onlining a memory block that is spanned by a zone
and reading /proc/pagetypeinfo, we can trigger a kernel BUG: ...
"
However, you also have to justify why it is okay to no longer consider
ZONE_DEVICE (I think walk_zones_in_node() will skip ZONE_DEVICE due to
assert_populated == true and ZONE_DEVICE will never be populated,
Therefore, we will never end in this code path with ZONE_DEVICE).
>
> [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/
>
> page:ffffea0021200000 is uninitialized and poisoned
> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> There is not page extension available.
> ------------[ cut here ]------------
> kernel BUG at include/linux/mm.h:1107!
> RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680
> Call Trace:
> walk_zones_in_node+0x3a/0xc0
> pagetypeinfo_show+0x260/0x2c0
> seq_read+0x27e/0x710
> proc_reg_read+0x12e/0x190
> __vfs_read+0x50/0xa0
> vfs_read+0xcb/0x1e0
> ksys_read+0xc6/0x160
> __x64_sys_read+0x43/0x50
> do_syscall_64+0xcc/0xaec
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Signed-off-by: Qian Cai <cai@....pw>
> ---
> mm/page_owner.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index dee931184788..03a6b19b3cdd 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
> pageblock_mt = get_pageblock_migratetype(page);
>
What about the pfn_valid() in the outermost loop? You can skip over the
whole pageblock if the first page is not online.
> for (; pfn < block_end_pfn; pfn++) {
> - if (!pfn_valid_within(pfn))
> + page = pfn_to_online_page(pfn);
> + if (!page)
> continue;
>
> - page = pfn_to_page(pfn);
> -
> if (page_zone(page) != zone)
> continue;
>
>
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists