Message-ID: <20150528181835.GI5989@linux.vnet.ibm.com>
Date:	Thu, 28 May 2015 11:18:35 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	Vlastimil Babka <vbabka@...e.cz>, Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>,
	Jerome Glisse <j.glisse@...il.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	jglisse@...hat.com, mgorman@...e.de, aarcange@...hat.com,
	airlied@...hat.com, aneesh.kumar@...ux.vnet.ibm.com,
	Cameron Buschardt <cabuschardt@...dia.com>,
	Mark Hairgrove <mhairgrove@...dia.com>,
	Geoffrey Gerfin <ggerfin@...dia.com>,
	John McKenna <jmckenna@...dia.com>, akpm@...ux-foundation.org,
	laijs@...fujitsu.com
Subject: Re: Interacting with coherent memory on external devices

On Thu, May 14, 2015 at 05:51:19PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
> > On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
> > > On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> > >> Sorry for reviving oldish thread...
> > >
> > > Well, that's actually appreciated since this is constructive discussion
> > > of the kind I was hoping to trigger initially :-) I'll look at
> > 
> > I hoped so :)
> > 
> > > ZONE_MOVABLE, I wasn't aware of its existence.
> > >
> > > Don't we still have the problem that ZONEs must be somewhat contiguous
> > > chunks ? Ie, my "CAPI memory" will be interleaved in the physical
> > > address space somewhat.. This is due to the address space on some of
> > > those systems where you'll basically have something along the lines of:
> > >
> > > [ node 0 mem ] [ node 0 CAPI dev ] .... [ node 1 mem] [ node 1 CAPI dev] ...
> > 
> > Oh, I see. The VM code should cope with that, but some operations would 
> > be inefficiently looping over the holes in the CAPI zone by 2MB 
> > pageblock per iteration. This would include compaction scanning, which 
> > would suck if you need those large contiguous allocations as you said. 
> > Interleaving works better if it's done with a smaller granularity.
> > 
> > But I guess you could just represent the CAPI as multiple NUMA nodes, 
> > each with single ZONE_MOVABLE zone. Especially if "node 0 CAPI dev" and 
> > "node 1 CAPI dev" differs in other characteristics than just using a 
> > different range of PFNs... otherwise what's the point of this split anyway?
> 
> Correct, I think we want the CAPI devs to look like CPU-less NUMA nodes
> anyway. This is the right way to target an allocation at one of them and
> it conveys the distance properly, so it makes sense.
> 
> I'll add the ZONE_MOVABLE to the list of things to investigate on our
> side, thanks for the pointer !

Any thoughts on CONFIG_MOVABLE_NODE and the corresponding "movable_node"
boot parameter?  It looks like it is designed to make an entire NUMA
node's memory hotpluggable, which seems consistent with what we are
trying to do here.  This feature is currently x86_64-only, so it would
need to be enabled on other architectures.

It looks like this is intended to be used by booting normally, but
keeping the CAPI nodes' memory offline, setting movable_node, then
onlining their memory.
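
To make the last step concrete, here is a rough sketch of onlining
offline memory blocks as movable through the memory-hotplug sysfs
interface.  It assumes the blocks appear as memoryN directories with a
"state" file that accepts "online_movable", as under
/sys/devices/system/memory.  The function name and the idea of pointing
it at a single node's directory (for example
/sys/devices/system/node/node1, which links to that node's memory
blocks) are illustrative assumptions, not something taken from an
existing tool.

```python
import os
import re

# Memory blocks show up in sysfs as directories named memory<N>.
MEMORY_BLOCK_RE = re.compile(r"^memory(\d+)$")


def online_offline_blocks_as_movable(memory_dir="/sys/devices/system/memory"):
    """Write "online_movable" to every offline memory block under memory_dir.

    Returns the names of the blocks that were written to, in numeric order.
    The function name and the default path argument are illustrative.
    """
    blocks = []
    for entry in os.listdir(memory_dir):
        match = MEMORY_BLOCK_RE.match(entry)
        if match:
            blocks.append((int(match.group(1)), entry))

    onlined = []
    for _, name in sorted(blocks):
        state_path = os.path.join(memory_dir, name, "state")
        with open(state_path) as f:
            state = f.read().strip()
        if state != "offline":
            continue  # leave blocks that are already online untouched
        # On a real system this needs root, and the kernel may reject
        # the request; an OSError is simply propagated to the caller.
        with open(state_path, "w") as f:
            f.write("online_movable")
        onlined.append(name)
    return onlined
```

Calling it with a CAPI node's sysfs directory instead of the default
would restrict the operation to that node's blocks, which is the case
of interest here.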

Thoughts?

							Thanx, Paul
