Message-ID: <YY6wZMcx/BeddUnH@fedora>
Date: Fri, 12 Nov 2021 13:20:20 -0500
From: Dennis Zhou <dennis@...nel.org>
To: Michal Hocko <mhocko@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, amakhalov@...are.com, cl@...ux.com,
mm-commits@...r.kernel.org, osalvador@...e.de,
stable@...r.kernel.org, tj@...nel.org
Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
Hello,
On Tue, Nov 09, 2021 at 12:00:46PM +0100, Michal Hocko wrote:
> On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> > On 09.11.21 09:37, Michal Hocko wrote:
> > > I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@dhcp22.suse.cz
> > > There was no response to that feedback. I will not go as far as to nack
> > > it explicitly because pcp allocator is not an area I would nack patches
> > > but seriously, this issue needs a deeper look rather than a paper over
> > > patch. I hope we do not want to do a similar thing to all callers of
> > > cpu_to_mem.
> >
> > While we could move it into the !HOLES version of cpu_to_mem(), calling
> > cpu_to_mem() on an offline (and eventually not even present) CPU (with
> > an offline node) is really a corner case.
> >
> > Instead of additional runtime overhead for all cpu_to_mem(), my take
> > would be to just do it for the random special cases. Sure, we can
> > document that people should be careful when calling cpu_to_mem() on
> > offline CPUs. But IMHO it's really a corner case.
>
> I suspect I haven't made myself clear enough. I do not think we should
> be touching cpu_to_mem/cpu_to_node and handle this corner case. We
> should be looking at the underlying problem instead. We cannot really
> rely on cpu to be onlined to have a proper node association. We should
> really look at the initialization code and handle this situation
> properly. Memory less nodes are something we have been dealing with
> already. This particular instance of the problem is new and we should
> understand why.
> --
> Michal Hocko
> SUSE Labs
So I think we're still short of a solution here. This patch addresses the
symptom but not the underlying problem in CPU hotplug.
I'm fine with this going in as a stopgap because I imagine the fixes to
hotplug are a lot more intrusive, but do we have someone who can own
the work to fix hotplug? I think that should be a requirement for
taking this because clearly it's hotplug that's broken and not percpu.
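To make the symptom-versus-cause distinction concrete, here is a toy
userspace model. It is not kernel code and not the patch under
discussion: every toy_* name and the CPU/node layout are made up for
illustration. It only shows the shape of a caller-side guard, where a
cpu_to_mem()-style lookup hands back a node that was never onlined and
the caller falls back to "no node preference".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS  4
#define TOY_NR_NODES 2
#define TOY_NO_NODE  (-1)   /* hypothetical "no preference" marker */

/*
 * Assumed layout for illustration only: CPU 3 is offline and still
 * mapped to node 1, and node 1 has never been brought online.
 */
static const int  toy_cpu_node[TOY_NR_CPUS]     = { 0, 0, 0, 1 };
static const bool toy_node_online[TOY_NR_NODES] = { true, false };

/* Stand-in for a cpu_to_mem()-style lookup: returns the recorded node unchecked. */
int toy_cpu_to_mem(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    return toy_cpu_node[cpu];
}

/*
 * Caller-side guard: if the recorded node is out of range or not online,
 * report "no preference" instead of handing a dead node to the allocator.
 * The stale CPU-to-node mapping itself is left untouched.
 */
int toy_alloc_node_for_cpu(int cpu)
{
    int node = toy_cpu_to_mem(cpu);

    if (node < 0 || node >= TOY_NR_NODES || !toy_node_online[node])
        return TOY_NO_NODE;
    return node;
}
```

In this model the guard keeps one caller safe, but toy_cpu_to_mem(3)
still returns the offline node to anyone else who asks, which is why
the mapping set up around hotplug is where the real fix belongs.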
Acked-by: Dennis Zhou <dennis@...nel.org>
Thanks,
Dennis