[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190516111607.GR16651@dhcp22.suse.cz>
Date: Thu, 16 May 2019 13:16:07 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Anshuman Khandual <anshuman.khandual@....com>
Cc: Mark Rutland <mark.rutland@....com>, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, akpm@...ux-foundation.org,
catalin.marinas@....com, will.deacon@....com,
mgorman@...hsingularity.net, james.morse@....com,
robin.murphy@....com, cpandya@...eaurora.org,
arunks@...eaurora.org, dan.j.williams@...el.com, osalvador@...e.de,
david@...hat.com, cai@....pw, logang@...tatee.com,
ira.weiny@...el.com
Subject: Re: [PATCH V3 2/4] arm64/mm: Hold memory hotplug lock while walking
for kernel page table dump
On Thu 16-05-19 16:36:12, Anshuman Khandual wrote:
> On 05/16/2019 03:53 PM, Mark Rutland wrote:
> > Hi Michal,
> >
> > On Wed, May 15, 2019 at 06:58:47PM +0200, Michal Hocko wrote:
> >> On Tue 14-05-19 14:30:05, Anshuman Khandual wrote:
> >>> The arm64 pagetable dump code can race with concurrent modification of the
> >>> kernel page tables. When a leaf entries are modified concurrently, the dump
> >>> code may log stale or inconsistent information for a VA range, but this is
> >>> otherwise not harmful.
> >>>
> >>> When intermediate levels of table are freed, the dump code will continue to
> >>> use memory which has been freed and potentially reallocated for another
> >>> purpose. In such cases, the dump code may dereference bogus addressses,
> >>> leading to a number of potential problems.
> >>>
> >>> Intermediate levels of table may by freed during memory hot-remove, or when
> >>> installing a huge mapping in the vmalloc region. To avoid racing with these
> >>> cases, take the memory hotplug lock when walking the kernel page table.
> >>
> >> Why is this a problem only on arm64
> >
> > It looks like it's not -- I think we're just the first to realise this.
> >
> > AFAICT x86's debugfs ptdump has the same issue if run conccurently with
> > memory hot remove. If 32-bit arm supported hot-remove, its ptdump code
> > would have the same issue.
> >
> >> and why do we even care for debugfs? Does anybody rely on this thing
> >> to be reliable? Do we even need it? Who is using the file?
> >
> > The debugfs part is used intermittently by a few people working on the
> > arm64 kernel page tables. We use that both to sanity-check that kernel
> > page tables are created/updated correctly after changes to the arm64 mmu
> > code, and also to debug issues if/when we encounter issues that appear
> > to be the result of kernel page table corruption.
> >
> > So while it's rare to need it, it's really useful to have when we do
> > need it, and I'd rather not remove it. I'd also rather that it didn't
> > have latent issues where we can accidentally crash the kernel when using
> > it, which is what this patch is addressing.
> >
> >> I am asking because I would really love to make mem hotplug locking less
> >> scattered outside of the core MM than more. Most users simply shouldn't
> >> care. Pfn walkers should rely on pfn_to_online_page.
> >
> > I'm not sure if that would help us here; IIUC pfn_to_online_page() alone
> > doesn't ensure that the page remains online. Is there a way to achieve
> > that other than get_online_mems()?
>
> Still wondering how pfn_to_online_page() is applicable here. It validates
> a given PFN and whether its online from sparse section mapping perspective
> before giving it's struct page. IIUC it is used during a linear scanning
> of a physical address range not for a page table walk. So how it can solve
> the problem when a struct page which was used as an intermediate level page
> table page gets released back to the buddy from another concurrent thread ?
Well, my comment about pfn_to_online_page was more generic and it might
not apply to this specific case. I meant to say that the code outside of
the core MM shouldn't really care about the hotplug locking.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists