Message-ID: <c82a0dd7-a99b-6def-83d4-a19fbdd405d9@redhat.com>
Date: Fri, 17 Jan 2020 14:08:06 +0100
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Leonardo Bras <leonardo@...ux.ibm.com>,
Nathan Lynch <nathanl@...ux.ibm.com>,
Allison Randal <allison@...utok.net>,
Nathan Fontenot <nfont@...ux.vnet.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>,
Dan Williams <dan.j.williams@...el.com>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Anshuman Khandual <anshuman.khandual@....com>,
lantianyu1986@...il.com, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH RFC v1] mm: is_mem_section_removable() overhaul
On 17.01.20 12:33, Michal Hocko wrote:
> On Fri 17-01-20 11:57:59, David Hildenbrand wrote:
>> Let's refactor that code. We want to check if we can offline memory
>> blocks. Add a new function is_mem_section_offlineable() for that and
>> make it call is_mem_section_offlineable() for each contained section.
>> Within is_mem_section_offlineable(), add some more sanity checks and
>> directly bail out if the section contains holes or if it spans multiple
>> zones.
>
> I didn't read the patch (yet) but I am wondering. If we want to touch
> this code, can we simply always return true there? I mean whoever
> depends on this check is racy and the failure can happen even after
> the sysfs says good to go, right? The check is essentially as expensive
> as calling the offlining code itself. So the only usecase I can think of
> is a dumb driver to crawl over blocks and check which is removable and
> try to hotremove it. But just trying to offline one block after another
> is essentially going to achieve the same.
Some thoughts:
1. It allows you to check if memory is likely to be offlineable without
taking expensive locks and trying to isolate pages (meaning:
zone->lock and mem_hotplug_lock, but also calling drain_all_pages()
when isolating)
2. There are use cases that want to identify a memory block/DIMM to
unplug. One example is PPC DLPAR code (see this patch). Going over all
memory blocks trying to offline them is an expensive operation.
3. powerpc-utils (https://github.com/ibm-power-utilities/powerpc-utils)
makes use of /sys/.../removable to speed up the search AFAIK (see the
sketch right after this list).
4. lsmem displays/groups by "removable".
5. If "removable=false" then it usually really is not offlineable.
Of course, there could also be races (e.g., freeing the last unmovable
page), but it means "don't even try". OTOH, "removable=true" is more
racy and gives fewer guarantees ("looks okay, feel free to try").
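Just to illustrate what that userspace pattern boils down to (this is
not powerpc-utils' actual code; the block-number upper bound of 2048 is
an arbitrary assumption for the example):

/*
 * Illustrative sketch only: walk memory blocks, skip the ones the
 * kernel already says are not removable, and only try to offline the
 * remaining ones via the "state" attribute.
 */
#include <stdio.h>

int main(void)
{
	char path[128], buf[16];
	FILE *f;
	int block;

	for (block = 0; block < 2048; block++) {	/* arbitrary upper bound */
		snprintf(path, sizeof(path),
			 "/sys/devices/system/memory/memory%d/removable", block);
		f = fopen(path, "r");
		if (!f)
			continue;		/* no such memory block */
		if (!fgets(buf, sizeof(buf), f))
			buf[0] = '0';
		fclose(f);
		if (buf[0] != '1')
			continue;		/* "don't even try" */

		/* looks offlineable -> now actually try to offline it */
		snprintf(path, sizeof(path),
			 "/sys/devices/system/memory/memory%d/state", block);
		f = fopen(path, "w");
		if (!f)
			continue;
		fputs("offline", f);	/* may still fail, the check is racy */
		fclose(f);
	}
	return 0;
}

The write to "state" can of course still fail; reading "removable" first
only avoids attempts that are known to be pointless.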
>
> Or does anybody see any reasonable usecase that would break if we did
> that unconditional behavior?
If we always returned "true", then the whole reason the interface was
originally introduced would be "broken" (meaning, less performant, as
you would try to offline every memory block).
commit 5c755e9fd813810680abd56ec09a5f90143e815b
Author: Badari Pulavarty <pbadari@...ibm.com>
Date: Wed Jul 23 21:28:19 2008 -0700
memory-hotplug: add sysfs removable attribute for hotplug memory remove
Memory may be hot-removed on a per-memory-block basis, particularly on
POWER where the SPARSEMEM section size often matches the memory-block
size. A user-level agent must be able to identify which sections of
memory are likely to be removable before attempting the potentially
expensive operation. This patch adds a file called "removable" to the
memory directory in sysfs to help such an agent. In this patch, a memory
block is considered removable if;
I'd love to see this go away (just like "valid_zones"), but I don't
think it is possible.
So this patch makes it work a little more correctly (multiple zones,
holes), cleans up the code, and avoids races with unplug code. It cannot,
however, give any stronger guarantee that memory offlining will actually
succeed.
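For reference, the section-level checks described above boil down to
something like the following sketch (not the actual patch code, just
the idea of bailing out early on holes or multiple zones):

#include <linux/mm.h>
#include <linux/mmzone.h>

/*
 * Sketch only, not the patch itself: a section does not even qualify
 * for an offlining attempt if it is not online, contains holes, or
 * spans more than one zone.
 */
static bool section_looks_offlineable(unsigned long section_nr)
{
	unsigned long pfn = section_nr_to_pfn(section_nr);
	unsigned long end_pfn = pfn + PAGES_PER_SECTION;
	struct zone *zone = NULL;

	if (!online_section_nr(section_nr))
		return false;

	for (; pfn < end_pfn; pfn++) {
		/* a hole within the section -> don't even try */
		if (!pfn_valid(pfn))
			return false;
		/* the section spans multiple zones -> don't even try */
		if (zone && page_zone(pfn_to_page(pfn)) != zone)
			return false;
		zone = page_zone(pfn_to_page(pfn));
	}
	return true;
}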
--
Thanks,
David / dhildenb