Message-ID: <4F672384.1030601@redhat.com>
Date: Mon, 19 Mar 2012 14:16:04 +0200
From: Avi Kivity <avi@...hat.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
Suresh Siddha <suresh.b.siddha@...el.com>,
Mike Galbraith <efault@....de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Dan Smith <danms@...ibm.com>,
Bharata B Rao <bharata.rao@...il.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC][PATCH 00/26] sched/numa
On 03/19/2012 02:09 PM, Peter Zijlstra wrote:
> On Mon, 2012-03-19 at 13:42 +0200, Avi Kivity wrote:
> > > That's intentional; it keeps the work accounted to the tasks that need
> > > it.
> >
> > The accounting part is good, the extra latency is not. If you have
> > spare resources (processors or dma engines) you can employ for eager
> > migration, why not make use of them?
>
> Afaik we do not use dma engines for memory migration.
We don't, but I think we should.
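
Roughly the kind of thing I have in mind, as a completely untested
sketch: the dmaengine calls below are the generic upstream helpers as I
understand them (channel acquisition and mapping-error checks are left
out), while the caller and the fallback policy are made up purely for
illustration.

#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>
#include <linux/highmem.h>

/* Copy one page via a DMA_MEMCPY-capable channel; fall back to the CPU. */
static void migrate_copy_page_dma(struct dma_chan *chan,
				  struct page *dst, struct page *src)
{
	struct device *dev = chan->device->dev;
	struct dma_async_tx_descriptor *tx;
	dma_addr_t dma_src, dma_dst;
	dma_cookie_t cookie;

	dma_src = dma_map_page(dev, src, 0, PAGE_SIZE, DMA_TO_DEVICE);
	dma_dst = dma_map_page(dev, dst, 0, PAGE_SIZE, DMA_FROM_DEVICE);

	tx = dmaengine_prep_dma_memcpy(chan, dma_dst, dma_src, PAGE_SIZE,
				       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
	if (!tx)
		goto cpu_copy;

	cookie = dmaengine_submit(tx);
	dma_async_issue_pending(chan);

	/*
	 * Waiting here is fine for a background migration thread, but is
	 * exactly what a fault path must not do.  (DMA_COMPLETE is spelled
	 * DMA_SUCCESS on older kernels.)
	 */
	if (dma_sync_wait(chan, cookie) == DMA_COMPLETE)
		goto out;

cpu_copy:
	copy_highpage(dst, src);	/* CPU copy as the fallback */
out:
	dma_unmap_page(dev, dma_src, PAGE_SIZE, DMA_TO_DEVICE);
	dma_unmap_page(dev, dma_dst, PAGE_SIZE, DMA_FROM_DEVICE);
}
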
> In any case, if you do cross-node migration frequently enough that the
> overhead of copying pages is a significant part of your time, then I'm
> guessing there's something wrong.
>
> If not, the latency should be amortised enough to not matter.
Amortization is okay for HPC-style applications but not for interactive
applications (including servers). It all depends on the numbers, of
course; maybe migrate on fault is okay, but we'll need to measure it
somehow.
> > > > - doesn't work with dma engines
> > >
> > > How does that work anyway? You'd have to reprogram your dma engine, so
> > > either the ->migratepage() callback does that and we're good, or it
> > > simply doesn't work at all.
> >
> > If it's called from the faulting task's context you have to sleep, and
> > the latency gets increased even more, plus you're dependent on the dma
> > engine's backlog. If you do all that from a background thread you don't
> > have to block (though you might have to cancel or discard a migration
> > if the page was changed while being copied).
>
> The current MoF implementation simply bails and uses the old page. It
> will never block.
Then it cannot use a dma engine.
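
To make that concrete, the shape of a fault path that never blocks looks
roughly like the sketch below; every helper name in it is invented for
illustration, and none of it is taken from the actual patch set.

#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/pagemap.h>

/*
 * Hypothetical migrate-on-fault helper: if anything gets in the way,
 * give up and keep using the (remote) old page rather than sleep.
 */
static struct page *mof_try_migrate(struct page *old_page, int target_nid)
{
	struct page *new_page;

	if (!trylock_page(old_page))
		return old_page;		/* contended: stay remote */

	new_page = alloc_pages_node(target_nid,
				    GFP_HIGHUSER_MOVABLE | __GFP_NOWARN, 0);
	if (!new_page) {
		unlock_page(old_page);
		return old_page;		/* no memory on target node */
	}

	/*
	 * A synchronous CPU copy keeps the path self-contained; handing
	 * this to a DMA engine would mean waiting for a completion here,
	 * which is exactly what this path must avoid.
	 */
	copy_highpage(new_page, old_page);

	/* ... swap the PTE and transfer page state here ... */

	unlock_page(old_page);
	return new_page;
}
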
> It's all a best-effort approach; a 'few' stray pages are OK as long as
> the bulk of the pages are local.
>
> If you're concerned, we can add per mm/vma counters to track this.
These are second- and third-order effects. Overall I'm happy; kvm is one
of the workloads most severely impacted by the current numa support, and
this looks like it addresses most of the issues.
--
error compiling committee.c: too many arguments to function