Message-ID: <20101216153100.GC1687@nowhere>
Date: Thu, 16 Dec 2010 16:31:03 +0100
From: Frederic Weisbecker <fweisbec@...il.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Chris Mason <chris.mason@...cle.com>,
Frank Rowand <frank.rowand@...sony.com>,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Mike Galbraith <efault@....de>,
Oleg Nesterov <oleg@...hat.com>, Paul Turner <pjt@...gle.com>,
Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH 5/5] sched: Reduce ttwu rq->lock contention

On Thu, Dec 16, 2010 at 03:56:07PM +0100, Peter Zijlstra wrote:
> Reduce rq->lock contention on try_to_wake_up() by changing the task
> state using a cmpxchg loop.
>
> Once the task is set to TASK_WAKING we're guaranteed to be the only one
> poking at it, and can then proceed to pick a new cpu without holding the
> rq->lock (XXX this opens some races).
>
> Then instead of locking the remote rq and activating the task, place
> the task on a remote queue, again using cmpxchg, and notify the remote
> cpu per IPI if this queue was empty to start processing its wakeups.
>
> This avoids (in most cases) not only having to lock the remote runqueue
> (and therefore the exclusive cacheline transfer thereof) but also
> touching all the remote runqueue data structures needed for the actual
> activation.
>
> As measured using: http://oss.oracle.com/~mason/sembench.c
>
> $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
> $ ./sembench -t 2048 -w 1900 -o 0
>
> unpatched: run time 30 seconds 537953 worker burns per second
> patched: run time 30 seconds 657336 worker burns per second
>
> Still need to sort out all the races marked XXX (non-trivial), and it's
> x86 only for the moment.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> ---
> arch/x86/kernel/smp.c | 1
> include/linux/sched.h | 7 -
> kernel/sched.c | 241 ++++++++++++++++++++++++++++++++++--------------
> kernel/sched_fair.c | 5
> kernel/sched_features.h | 3
> kernel/sched_idletask.c | 2
> kernel/sched_rt.c | 4
> kernel/sched_stoptask.c | 3
> 8 files changed, 190 insertions(+), 76 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/smp.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/smp.c
> +++ linux-2.6/arch/x86/kernel/smp.c
> @@ -205,6 +205,7 @@ void smp_reschedule_interrupt(struct pt_
> /*
> * KVM uses this interrupt to force a cpu out of guest mode
> */
> + sched_ttwu_pending();
> }

Great, that's going to simplify the remote tick restart I'm doing on
wake up for the nohz task thing, and lower its overhead too.
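For readers following along, the scheme in the quoted changelog (claim
the task with a cmpxchg loop, push it on a per-cpu pending list with
cmpxchg, IPI the target only if the list was empty) might look roughly
like the user-space sketch below. It is an illustration under
assumptions, not the code from the patch: it uses C11 atomics instead of
the kernel's cmpxchg(), the ST_* states, struct layouts and function
names are all hypothetical, and the IPI is reduced to a boolean return
value. In the patch itself the drain side is hooked in through
sched_ttwu_pending() from the reschedule interrupt.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task states, loosely modeled on the kernel's. */
enum { ST_RUNNING = 0, ST_SLEEPING = 1, ST_WAKING = 2 };

struct task {
	_Atomic int state;
	struct task *wake_next;		/* link in the pending-wakeup list */
};

struct runqueue {
	_Atomic(struct task *) wake_list;	/* lockless LIFO of pending wakeups */
	int nr_running;				/* only touched by the owning cpu */
};

/*
 * Claim a sleeping task for wakeup. Only one waker can win the
 * SLEEPING -> WAKING transition; everyone else backs off.
 */
static bool claim_for_wakeup(struct task *p)
{
	int old = atomic_load(&p->state);

	do {
		if (old != ST_SLEEPING)
			return false;	/* running, or claimed by another waker */
	} while (!atomic_compare_exchange_weak(&p->state, &old, ST_WAKING));

	return true;
}

/*
 * Push a claimed task on the remote queue's pending list without taking
 * any lock. Returns true if the list was empty before the push, i.e. the
 * caller is the one who has to notify (IPI) the remote cpu.
 */
static bool queue_remote_wakeup(struct runqueue *rq, struct task *p)
{
	struct task *head = atomic_load(&rq->wake_list);

	do {
		p->wake_next = head;	/* head is refreshed on a failed exchange */
	} while (!atomic_compare_exchange_weak(&rq->wake_list, &head, p));

	return head == NULL;
}

/*
 * Runs on the target cpu: grab the whole pending list in one exchange
 * and activate each task locally. Returns the number of tasks woken.
 */
static int process_pending_wakeups(struct runqueue *rq)
{
	struct task *list = atomic_exchange(&rq->wake_list, NULL);
	int woken = 0;

	while (list) {
		struct task *next = list->wake_next;

		atomic_store(&list->state, ST_RUNNING);
		rq->nr_running++;
		woken++;
		list = next;
	}

	return woken;
}
```

The point of returning "was empty" from the push is that a burst of
wakeups aimed at the same cpu costs one notification, and the target
touches its own runqueue data without any remote cacheline bouncing.
The sketch does not attempt to model the races the changelog marks XXX.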