Message-ID: <6b02e1f4-a68f-787d-fbde-ec081ebba058@oracle.com>
Date: Thu, 15 Feb 2018 10:05:19 -0500
From: Pavel Tatashin <pasha.tatashin@...cle.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: Steve Sistare <steven.sistare@...cle.com>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Linux Memory Management List <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Bharata B Rao <bharata@...ux.vnet.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
hpa@...or.com, x86@...nel.org, dan.j.williams@...el.com,
kirill.shutemov@...ux.intel.com, bhe@...hat.com
Subject: Re: [PATCH v3 1/4] mm/memory_hotplug: enforce block size aligned range check

> No, not really. I just think the alignment shouldn't really matter. Each
> memory block should simply represent a hotpluggable entity with a well
> defined pfn start and size (in multiples of section size). This is in
> fact what we do internally anyway. One problem might be that an existing
> userspace might depend on the existing size restrictions so we might not
> be able to have variable block sizes. But block size alignment should be
> fixable.
>
Hi Michal,
I see what you mean, and I agree Linux should simply honor reasonable
requests from HW/HV.
On x86 qemu the hotpluggable entity is 128M; on sun4v SPARC it is 256M.
With the current scheme we would still end up with a huge number of
memory devices in sysfs if the block size is fixed and equal to the
minimum hotpluggable entity. Just as an example, SPARC sun4v may have
logical domains of up to 32T; with 256M granularity that is 131K files
in /sys/devices/system/memory/!
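The count above is just the domain size divided by the block size. A quick sketch of the arithmetic, written in Python for brevity; `memory_block_count` is a made-up helper name for illustration, not anything in the kernel:

```python
MIB = 1 << 20
TIB = 1 << 40

def memory_block_count(total_bytes, block_bytes):
    """Number of memoryXXX entries needed if every block has one fixed size."""
    # Round up so a partial trailing block still gets its own device.
    return -(-total_bytes // block_bytes)

# sun4v example from above: 32T logical domain, 256M blocks.
sun4v_blocks = memory_block_count(32 * TIB, 256 * MIB)

# Same domain size at the 128M x86 qemu granularity, for comparison only.
x86_blocks = memory_block_count(32 * TIB, 128 * MIB)

print(sun4v_blocks, x86_blocks)
```

Halving the block size doubles the number of sysfs entries, which is why fixing the block size at the minimum hotpluggable entity scales so poorly.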
But if the block size is variable, I am not sure how to solve it: the
whole interface would have to be redefined. Even if we hotplugged a
large, highly aligned chunk of memory and created only one memory device
for it, we would still need a way to remove just a small piece of that
memory if the underlying HV/HW requested it.
	/sys/devices/system/memory/block_size_bytes
would have to be moved into each memory block, and
	echo offline > /sys/devices/system/memory/memoryXXX/state
would need to be redefined somehow to work on only part of the block.
I am not really sure what a good solution would be without breaking
userspace.
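To make the problem concrete, here is a rough sketch of the question a variable-size scheme would have to answer: given one large block and a removal request from HV/HW that covers only part of it, what is removed and what is left behind? This is plain Python arithmetic over byte ranges, not kernel code; `split_block` and the `(start, size)` tuple layout are hypothetical and only illustrate the bookkeeping.

```python
MIB = 1 << 20
GIB = 1 << 30

def split_block(block_start, block_size, req_start, req_size):
    """Split one memory block around a partial removal request.

    Returns (removed, remaining): removed is a (start, size) tuple, or None
    if the request does not overlap the block; remaining is a list of
    (start, size) tuples for the pieces of the block that stay online.
    """
    block_end = block_start + block_size
    # Clamp the request to the part that actually lies inside the block.
    lo = max(block_start, req_start)
    hi = min(block_end, req_start + req_size)
    if lo >= hi:
        return None, [(block_start, block_size)]
    remaining = []
    if lo > block_start:
        remaining.append((block_start, lo - block_start))
    if hi < block_end:
        remaining.append((hi, block_end - hi))
    return (lo, hi - lo), remaining

# One 4G block; HV asks for 256M back, starting 1G into the block.
removed, remaining = split_block(0, 4 * GIB, 1 * GIB, 256 * MIB)
print(removed, remaining)
```

With a single memoryXXX device covering the whole 4G, the current `state` file has no way to express "offline only this 256M", and after the removal the one block has turned into two pieces that would each need their own size.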
Thank you,
Pavel