Message-Id: <20200903123136.1fa50e773eb58c6200801e65@linux-foundation.org>
Date: Thu, 3 Sep 2020 12:31:36 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: David Hildenbrand <david@...hat.com>
Cc: Pavel Tatashin <pasha.tatashin@...een.com>,
linux-kernel@...r.kernel.org, mhocko@...e.com, linux-mm@...ck.org,
osalvador@...e.de, richard.weiyang@...il.com, vbabka@...e.cz,
rientjes@...gle.com
Subject: Re: [PATCH v2] mm/memory_hotplug: drain per-cpu pages again during
memory offline
On Thu, 3 Sep 2020 19:36:26 +0200 David Hildenbrand <david@...hat.com> wrote:
> (still on vacation, back next week on Tuesday)
>
> I didn't look into discussions in v1, but to me this looks like we are
> trying to hide an actual bug by implementing hacks in the caller
> (repeated calls to drain_all_pages()). What about alloc_contig_range()
> users - you get more allocation errors just because PCP code doesn't
> play along.
>
> There *is* strong synchronization with the page allocator - however,
> there seems to be one corner-case race where we allow pages to be
> allocated from isolated pageblocks.
>
> I want that fixed instead if possible, otherwise this is just an ugly
> hack to make the obvious symptoms (offlining looping forever) disappear.
>
> If that is not easily possible, I'd much rather see all
> drain_all_pages() calls moved to the caller and the expected behavior
> documented, instead of stating "there is no strong synchronization with
> the page allocator" - which is wrong in all but the PCP case (and there
> only in one possible race?).
>
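To make that corner case concrete, here is a rough sketch of the window
being talked about (this is not the actual mm/page_alloc.c code and the
function name is made up; only the ordering matters): the free path samples
the pageblock migratetype before the isolation takes effect, so the page can
still land on a per-cpu list afterwards and sit there until the next drain.

static void free_unref_page_sketch(struct page *page)
{
	unsigned long pfn = page_to_pfn(page);
	int migratetype = get_pfnblock_migratetype(page, pfn);

	/*
	 * Another CPU can isolate this pageblock and run
	 * drain_all_pages() right here; 'migratetype' is now stale.
	 */

	if (is_migrate_isolate(migratetype)) {
		/* isolated: would go straight back to the buddy */
		return;
	}

	/*
	 * Otherwise the page is queued on this CPU's PCP list even
	 * though its pageblock is now isolated, and it stays there
	 * until the next drain.
	 */
}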
It's a two-line hack which fixes a bug in -stable kernels, so I'm
inclined to proceed with it anyway. We can undo it later on as part of
a better fix, OK?
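For reference, the shape of the change is roughly the following - my
simplified paraphrase of the offline_pages() retry loop, not the exact
patch:

	do {
		pfn = start_pfn;
		/* migrate away whatever is still movable in the range */
		while (!scan_movable_pages(pfn, end_pfn, &pfn))
			do_migrate_range(pfn, end_pfn);

		/*
		 * The per-cpu lists were already drained after
		 * start_isolate_page_range(), but pages may have been
		 * freed back to them since then, so drain once more
		 * before re-checking (this is the added call).
		 */
		drain_all_pages(zone);

		/* check again */
		ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
	} while (ret);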
Unless you think there's some new misbehaviour which we might see as a
result of this approach?