Date:	Tue, 14 Oct 2008 14:06:21 -0700
From:	Gary Hade <garyhade@...ibm.com>
To:	Yasunori Goto <y-goto@...fujitsu.com>
Cc:	Gary Hade <garyhade@...ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, pbadari@...ibm.com, mel@....ul.ie,
	lcm@...ibm.com, mingo@...e.hu, greg@...ah.com,
	dave@...ux.vnet.ibm.com, nish.aravamudan@...il.com
Subject: Re: [PATCH 1/2] [REPOST] mm: show node to memory section
	relationship with symlinks in sysfs

On Tue, Oct 14, 2008 at 08:54:21PM +0900, Yasunori Goto wrote:
> > On Fri, Oct 10, 2008 at 04:32:30PM -0700, Andrew Morton wrote:
> > > On Fri, 10 Oct 2008 16:18:44 -0700
> > > Gary Hade <garyhade@...ibm.com> wrote:
> > > 
> > > > On Fri, Oct 10, 2008 at 02:59:50PM -0700, Andrew Morton wrote:
> > > > > On Fri, 10 Oct 2008 14:33:57 -0700
> > > > > Gary Hade <garyhade@...ibm.com> wrote:
> > > > > 
> > > > > > On Fri, Oct 10, 2008 at 12:42:39PM -0700, Andrew Morton wrote:
> > > > > > > On Thu, 9 Oct 2008 12:21:15 -0700
> > > > > > > Gary Hade <garyhade@...ibm.com> wrote:
> > > > > > > 
> > > > > > > > Show node to memory section relationship with symlinks in sysfs
> > > > > > > > 
> > > > > > > > Add /sys/devices/system/node/nodeX/memoryY symlinks for all
> > > > > > > > the memory sections located on nodeX.  For example:
> > > > > > > > /sys/devices/system/node/node1/memory135 -> ../../memory/memory135
> > > > > > > > indicates that memory section 135 resides on node1.
> > > > > > > 
> > > > > > > I'm not seeing here a description of why the kernel needs this feature.
> > > > > > > Why is it useful?  How will it be used?  What value does it have to
> > > > > > > our users?
> > > > > > 
> > > > > > Sorry, I should have included that.  In our case, it is another
> > > > > > small step towards eventual total node removal.  We will need to
> > > > > > know which memory sections to offline for whatever node is targeted
> > > > > > for removal.  However, I suspect that exposing the node to section
> > > > > > information to user-level could be useful for other purposes.
> > > > > > For example, I have been thinking that using memory hotremove
> > > > > > functionality to modify the amount of available memory on specific
> > > > > > nodes without having to physically add/remove DIMMs might be useful
> > > > > > to those that test application or benchmark performance on a
> > > > > > multi-node system in various memory configurations.
> > > > > > 
> > > > > 
> > > > > hm, OK, thanks.  It does sound a bit thin, and if we merge this then
> > > > > not only do we get a porkier kernel,
> > > > 
> > > > Would you feel the same about the size increase if patch 2/2 (include
> > > > memory section subtree in sysfs with only sparsemem enabled) was
> > > > withdrawn?
> > > > 
> > > > Without patch 2/2 the size increase for non-Sparsemem or Sparsemem
> > > > w/o memory hotplug kernels is extremely small.  Even for memory hotplug
> > > > enabled kernels there is only a little extra code in ./drivers/base/node.o
> > > > which only gets linked into NUMA enabled kernels.  I can gather some numbers
> > > > if necessary.
> > > 
> > > Size is probably a minor issue on memory-hotpluggable machines.
> > > 
> > > > > we also get a new userspace interface which we're then locked into.
> > > > 
> > > > True.
> > > 
> > > That's a bigger issue.  The later we leave this sort of thing, the more
> > > information we have.
> > 
> > I understand your concerns about adding possibly frivolous interfaces
> > but in this case we are simply eliminating a very obvious hole in the
> > existing set of memory hot-add/remove interfaces.  In general, it
> > makes absolutely no sense to provide a resource add/remove mechanism
> > without telling the user where the resource is physically located.
> > i.e. providing the _maximum_ possible amount of location information
> > available for the add/remove controllable resource.  This is especially
> > critical for large multi-node systems and for resources that can impact
> > application or overall system performance.
> > 
> > The kernel already exports node location information for CPUs
> > (e.g. /sys/devices/system/node/node0/cpu0 -> ../../cpu/cpu0) and
> > PCI devices (e.g. ./devices/pci0000:00/0000:00:00.0/numa_node).
> > Why should memory be treated any differently?
> > 
> > The memory hot-add/remove interfaces include physical device files
> > (e.g. /sys/devices/system/memory/memory0/phys_device) which are not
> > yet fully implemented.  When systems that support removable memory
> > modules force this interface to mature, node location information
> > will become even more critical.  This feature will not be very useful
> > on multi-node systems if the user does not know what node a specific
> > memory module is installed in.  It may be possible to encode the
> > node ID into the string provided by the phys_device file but a
> > more general node to memory section association as provided by this
> > patch is better since it can be used for other purposes.
> 
> Sorry for the late response.
> 
> Our Fujitsu box can hot-add a node. This means a user/script has to
> find which memory sections and cpus belong to the added node when a node
> hot-add is executed.
> 
> My current hotplug script is very poor. It onlines all offlined cpus and memory.
> However, if the user offlined one memory section intentionally because of a
> memory error message, the script cannot tell that this was intentional, and it
> onlines the faulty section again. I think this is one of the reasons why this
> link is necessary.

When it comes time to replace that misbehaving memory I bet the
user will be delighted to know which node it is in.  This sort
of benefit will of course _not_ be limited to the small community
of node hot-add/remove capable systems.
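
To illustrate how user-level code might consume these links, here is a
minimal sketch, assuming the /sys/devices/system/node/nodeX/memoryY
layout proposed by this patch.  The function names and the sysfs_root
parameter are hypothetical; they are not part of the patch and exist
only so the lookup can be tried against a scratch directory.

```python
import os
import re

# Assumed layout (from the patch description):
#   /sys/devices/system/node/nodeX/memoryY -> ../../memory/memoryY
_MEMORY_LINK = re.compile(r"^memory(\d+)$")
_NODE_DIR = re.compile(r"^node(\d+)$")


def sections_on_node(node_id, sysfs_root="/sys/devices/system/node"):
    """Return the sorted memory section numbers linked under nodeX."""
    node_dir = os.path.join(sysfs_root, f"node{node_id}")
    sections = []
    for name in os.listdir(node_dir):
        match = _MEMORY_LINK.match(name)
        if match:  # skip other entries such as cpuN or meminfo
            sections.append(int(match.group(1)))
    return sorted(sections)


def node_of_section(section, sysfs_root="/sys/devices/system/node"):
    """Return the node whose directory holds memory<section>, or None."""
    for name in os.listdir(sysfs_root):
        match = _NODE_DIR.match(name)
        if not match:
            continue
        link = os.path.join(sysfs_root, name, f"memory{section}")
        if os.path.lexists(link):  # lexists: the link itself, not its target
            return int(match.group(1))
    return None
```

A hotplug script could call sections_on_node() for a newly added node
and online only those sections, leaving alone any section the user
offlined on purpose; node_of_section() answers the reverse question of
which node holds a misbehaving section.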

Gary

-- 
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503  IBM T/L: 775-4503
garyhade@...ibm.com
http://www.ibm.com/linux/ltc
