Message-ID: <alpine.DEB.0.9999.0710251648430.23810@chino.kir.corp.google.com>
Date: Thu, 25 Oct 2007 16:56:18 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Christoph Lameter <clameter@....com>
cc: Andrew Morton <akpm@...ux-foundation.org>, Andi Kleen <ak@...e.de>,
Paul Jackson <pj@....com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
linux-kernel@...r.kernel.org
Subject: Re: [patch 2/2] cpusets: add interleave_over_allowed option

On Thu, 25 Oct 2007, Christoph Lameter wrote:
> More interactions between cpusets and memory policies. We have to be
> careful here to keep clean semantics.
>

I agree.

> Isn't it a bit surprising for an application that has set up a custom
> MPOL_INTERLEAVE policy if the nodes suddenly change because of a cpuset or
> mems_allowed change?
>

Every MPOL_INTERLEAVE policy is a custom policy that the application has
set up.  If you don't use cpusets at all, the nodemask you pass to
set_mempolicy() with MPOL_INTERLEAVE is static and won't change without
the application's knowledge.  The application has full control over the
nodemask it wants to interleave over.
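
For reference, a minimal sketch of such a setup (assumes a box with at
least two memory nodes and libnuma's <numaif.h> wrapper; build with
-lnuma, error handling kept to a minimum):

#include <numaif.h>     /* set_mempolicy(), MPOL_INTERLEAVE */
#include <stdlib.h>

int main(void)
{
    /* Interleave this task's allocations over nodes 0 and 1. */
    unsigned long nodemask = (1UL << 0) | (1UL << 1);

    /*
     * Without cpusets, this nodemask is static: allocations are
     * round-robined over exactly these nodes until the application
     * itself calls set_mempolicy() again.
     */
    if (set_mempolicy(MPOL_INTERLEAVE, &nodemask,
                      sizeof(nodemask) * 8) < 0)
        return EXIT_FAILURE;

    /* ... allocate and fault in memory here ... */
    return EXIT_SUCCESS;
}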

The problem occurs when you add cpusets into the mix and permit the
allowed nodes to change without the application's knowledge.  Right now,
a simple remap is done, so if the cardinality of the set of allowed nodes
decreases, you're interleaving over a smaller number of nodes; if the
cardinality increases, your interleave nodemask isn't expanded.  That's
the problem we're facing.  The remap itself is troublesome because it
ignores the fact that the user asked for a custom nodemask in the first
place; it can remap an interleave policy onto several nodes that already
contend with one another.
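
To see what that remap does in practice, here is a toy user-space model
of the positional remap described above (this is only an illustration,
not the kernel implementation; it treats node masks as plain bitmaps and
assumes the new mems are non-empty):

#include <stdio.h>

#define MAX_NODES 8

/*
 * Toy model: the n-th allowed node in old_mems maps to the
 * (n mod weight(new_mems))-th allowed node in new_mems.
 */
static unsigned int remap(unsigned int ilv, unsigned int old_mems,
                          unsigned int new_mems)
{
    unsigned int out = 0;
    int node, pos = 0, new_weight = __builtin_popcount(new_mems);

    for (node = 0; node < MAX_NODES; node++) {
        if (!(old_mems & (1U << node)))
            continue;
        if (ilv & (1U << node)) {
            int want = pos % new_weight, n, seen = 0;

            /* Pick the want-th set bit of new_mems. */
            for (n = 0; n < MAX_NODES; n++)
                if ((new_mems & (1U << n)) && seen++ == want) {
                    out |= 1U << n;
                    break;
                }
        }
        pos++;
    }
    return out;
}

int main(void)
{
    /* Shrink: interleave over {0-3}, cpuset moves from {0-3} to {4,5}. */
    printf("%#x\n", remap(0x0f, 0x0f, 0x30)); /* 0x30: four nodes fold onto two */

    /* Grow: interleave over {0,1}, cpuset grows from {0,1} to {0-3}. */
    printf("%#x\n", remap(0x03, 0x03, 0x0f)); /* 0x03: the mask is not expanded */
    return 0;
}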

Normally, MPOL_INTERLEAVE is used to reduce bus contention and improve the
throughput of the application.  If you remap the nodes being interleaved
over, which is what currently happens when mems_allowed changes, you can
actually increase latency because you end up interleaving over nodes that
share the same bus.

This isn't a memory policy problem, because mempolicy simply applies a
specific policy over a set of nodes.  With my change, cpusets are required
to update the interleave nodemask if the user asked for that behavior with
interleave_over_allowed.  Cpusets are, after all, the ones that changed
mems_allowed in the first place and invalidated our custom interleave
policy.  Mempolicy can't infer on its own what we should do, so we let the
creator of the cpuset specify it for us.  So the proper place to modify an
interleave policy is in cpusets and not in mempolicy itself.
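
As a purely hypothetical usage sketch, assuming the cpuset filesystem is
mounted at /dev/cpuset and the patch exposes the flag as a per-cpuset
file literally named "interleave_over_allowed" (both are assumptions on
my part), the cpuset's creator would just enable it:

#include <stdio.h>

int main(void)
{
    /* Path and file name are assumptions based on the option's name. */
    FILE *f = fopen("/dev/cpuset/myjob/interleave_over_allowed", "w");

    if (!f)
        return 1;

    /*
     * With the flag enabled, a later change to this cpuset's mems is
     * expected to update its tasks' MPOL_INTERLEAVE nodemasks to the
     * new mems_allowed rather than positionally remapping them.
     */
    fputs("1\n", f);
    fclose(f);
    return 0;
}
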
David