[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <bde72792-9116-f28d-f252-084345631f15@arm.com>
Date: Fri, 8 Feb 2019 08:33:34 +0530
From: Anshuman Khandual <anshuman.khandual@....com>
To: Robin Murphy <robin.murphy@....com>,
Oscar Salvador <osalvador@...e.de>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
gregkh@...uxfoundation.org, rafael@...nel.org, mhocko@...nel.org,
akpm@...ux-foundation.org
Subject: Re: [PATCH] mm/memory-hotplug: Add sysfs hot-remove trigger
On 02/07/2019 09:02 PM, Robin Murphy wrote:
> On 07/02/2019 13:36, Oscar Salvador wrote:
>> On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote:
>>> ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug,
>>> but being able to exercise the (arguably trickier) hot-remove path would
>>> be even more useful. Extend the feature to allow removal of offline
>>> sections to be triggered manually to aid development.
>>>
>>> Signed-off-by: Robin Murphy <robin.murphy@....com>
>>> ---
>>>
>>> This is inspired by a previous proposal[1], but in coming up with a
>>> more robust interface I ended up rewriting the whole thing from
>>> scratch. The lack of documentation is semi-deliberate, since I don't
>>> like the idea of anyone actually relying on this interface as ABI, but
>>> as a handy tool it felt useful enough to be worth sharing :)
>>
>> Hi Robin,
>>
>> I think this might come in handy, especially when trying to test hot-remove
>> on arch's that do not have any means to hot-remove memory, or even on virtual
>> platforms that do not have yet support for hot-remove depending on the platform,
>> like qemu/arm64.
>>
>>
>> I could have used this while testing hot-remove on other archs for [1]
>>
>>>
>>> Robin.
>>>
>>> [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/
>>>
>>
>>> + if (mem->state != MEM_OFFLINE)
>>> + return -EBUSY;
>>
>> We do have the helper "is_memblock_offlined()", although it is only used in one place now.
>> So, I would rather use it here as well.
>
> Ooh, if I'd actually noticed that that helper existed, I would indeed have used it - fixed.
>
>>> +
>>> + ret = lock_device_hotplug_sysfs();
>>> + if (ret)
>>> + return ret;
>>> +
>>> + if (device_remove_file_self(dev, attr)) {
>>> + __remove_memory(pfn_to_nid(start_pfn), PFN_PHYS(start_pfn),
>>> + MIN_MEMORY_BLOCK_SIZE * sections_per_block);
>>
>> Sorry, I am not into sysfs inners, but I thought that:
>> device_del::device_remove_attrs::device_remove_groups::sysfs_remove_groups
>> would be enough to remove the dev attributes.
>> I guess in this case that is not enough, could you explain why?
>
> As I found out the hard way, since the "remove" attribute itself belongs to the device being removed, the standard device teardown callchain would end up trying to remove the file from its own method, which results in deadlock. Fortunately, the PCI sysfs code has a similar "remove" attribute which showed me how it should be handled - following the kerneldoc breadcrumb trail to kernfs_remove_self() hopefully explains it more completely.
Instead we could have an interface like /sys/devices/system/memory/[unprobe|remove]
which would take a memory block start address looking into the existing ones and
attempt to remove [addr, addr + memory_block_size] provided its already offlined.
This will be exact opposite for /sys/devices/system/memory/probe except the fact
that it can trigger onlining of the memory automatically (even this new one could
trigger offlining automatically as well). But I dont have a preference between the
proposed one or this one. Either of them should be okay.
Powered by blists - more mailing lists