Message-Id: <1193411888.5032.9.camel@localhost>
Date: Fri, 26 Oct 2007 11:18:08 -0400
From: Lee Schermerhorn <Lee.Schermerhorn@...com>
To: Christoph Lameter <clameter@....com>
Cc: David Rientjes <rientjes@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <ak@...e.de>, Paul Jackson <pj@....com>, linux-kernel@...r.kernel.org
Subject: Re: [patch 2/2] cpusets: add interleave_over_allowed option

On Thu, 2007-10-25 at 17:28 -0700, Christoph Lameter wrote:
> On Thu, 25 Oct 2007, David Rientjes wrote:
>
> > The problem occurs when you add cpusets into the mix and permit the
> > allowed nodes to change without the application's knowledge.  Right now,
> > a simple remap is done, so if the cardinality of the set of nodes
> > decreases, you're interleaving over a smaller number of nodes.  If the
> > cardinality increases, your interleaved nodemask isn't expanded.  That's
> > the problem we're facing.  The remap itself is troublesome because it
> > doesn't take into account the user's desire for a custom nodemask to be
> > used anyway; it could remap an interleaved policy over several nodes that
> > will already be contended with one another.
>
> Right.  So I think we are fine if the application cannot set up boundaries
> for interleave.
>
> > Normally, MPOL_INTERLEAVE is used to reduce bus contention and improve the
> > throughput of the application.  If you remap the number of nodes to
> > interleave over, which is currently how it's done when mems_allowed
> > changes, you could actually be increasing latency because you're
> > interleaving over the same bus.
>
> Well, you may hit some nodes more than others, so a slight performance
> degradation.
>
> > This isn't a memory policy problem because all it does is effect a
> > specific policy over a set of nodes.  With my change, cpusets are required
> > to update the interleaved nodemask if the user specified that they desire
> > the feature with interleave_over_allowed.  Cpusets are, after all, the
> > ones that changed mems_allowed in the first place and invalidated our
> > custom interleave policy.  We simply can't make inferences about what we
> > should do, so we allow the creator of the cpuset to specify it for us.  So
> > the proper place to modify an interleaved policy is in cpusets and not
> > mempolicy itself.
>
> With that, MPOL_INTERLEAVE would be context dependent and would no longer
> need translation.  Lee had similar ideas.  Lee: could we make
> MPOL_INTERLEAVE generally cpuset context dependent?

That's what my "cpuset-independent interleave" patch does.  David doesn't
like the "null node mask" interface because it doesn't work with libnuma.
I plan to fix that, but I'm chasing other issues.  I should get back to
the mempolicy work after today.

What I like about cpuset-independent interleave is that the "policy remap"
when a cpuset changes is a no-op--there is no need to change the policy.
Just as the "preferred local" policy chooses the node where the allocation
occurs, my cpuset-independent interleave patch interleaves across the set
of nodes available at the time of the allocation.  The application has to
ask for this behavior explicitly, via the null/empty nodemask or the TBD
libnuma API.  IMO, this is the only reasonable interleave policy for apps
running in dynamic cpusets.

An aside: if David et al. [at Google] are using cpusets on fake NUMA for
resource management [I don't know that this is the case, but I saw some
discussions a while back indicating it might be], then maybe this becomes
less of an issue when control groups [a.k.a. containers] and memory
resource controls come to fruition?

Lee
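For reference, a minimal sketch of the explicit-nodemask interface the thread
is arguing about, using libnuma's <numaif.h> wrapper for the set_mempolicy(2)
syscall (link with -lnuma).  The node numbers and buffer size are illustrative
only; the null/empty-nodemask variant is the proposal in Lee's patch and is not
what this snippet does.

/*
 * Sketch: request MPOL_INTERLEAVE over an explicit nodemask.
 * Assumes a machine with at least nodes 0 and 1.
 */
#include <numaif.h>	/* set_mempolicy(), MPOL_INTERLEAVE */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	unsigned long nodemask = 0;

	/* Ask to interleave over nodes 0 and 1.  If the enclosing cpuset's
	 * mems_allowed later shrinks or grows, the kernel remaps this mask
	 * without the application's knowledge -- the behavior David objects
	 * to in the quoted discussion. */
	nodemask |= 1UL << 0;
	nodemask |= 1UL << 1;

	if (set_mempolicy(MPOL_INTERLEAVE, &nodemask,
			  sizeof(nodemask) * 8) != 0) {
		perror("set_mempolicy(MPOL_INTERLEAVE)");
		return EXIT_FAILURE;
	}

	/* Pages faulted in after this point are spread round-robin across
	 * the requested nodes. */
	size_t sz = 64UL << 20;
	char *buf = malloc(sz);
	if (buf) {
		memset(buf, 1, sz);
		free(buf);
	}
	return EXIT_SUCCESS;
}

Under the interface Lee describes, the same call with a null/empty nodemask
would mean "interleave over whatever mems_allowed happens to be at allocation
time," which is why no remap is needed when the cpuset changes.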