Message-ID: <20130125104025.GA14978@linux.vnet.ibm.com>
Date: Fri, 25 Jan 2013 16:10:25 +0530
From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Avi Kivity <avi.kivity@...il.com>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Gleb Natapov <gleb@...hat.com>, Ingo Molnar <mingo@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Rik van Riel <riel@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
Chegu Vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Andrew Jones <drjones@...hat.com>
Subject: Re: [PATCH V3 RESEND RFC 1/2] sched: Bail out of yield_to when
source and target runqueue has one task
* Ingo Molnar <mingo@...nel.org> [2013-01-24 11:32:13]:
>
> * Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com> wrote:
>
> > From: Peter Zijlstra <peterz@...radead.org>
> >
> > In undercommitted scenarios, especially with large guests, the
> > yield_to overhead is significantly high. When the run queue length of
> > both source and target is one, take the opportunity to bail out and
> > return -ESRCH. This return condition can be further exploited to
> > quickly come out of the PLE handler.
> >
> > (History: Raghavendra initially worked on breaking out of the KVM PLE
> > handler upon seeing source runqueue length = 1, but that required
> > exporting the rq length. Peter came up with the elegant idea of
> > returning -ESRCH from the scheduler core.)
> >
> > Signed-off-by: Peter Zijlstra <peterz@...radead.org>
> > Raghavendra, Checking the rq length of target vcpu condition added.(thanks Avi)
> > Reviewed-by: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
> > Signed-off-by: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
> > Acked-by: Andrew Jones <drjones@...hat.com>
> > Tested-by: Chegu Vinod <chegu_vinod@...com>
> > ---
> >
> > kernel/sched/core.c | 25 +++++++++++++++++++------
> > 1 file changed, 19 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 2d8927f..fc219a5 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4289,7 +4289,10 @@ EXPORT_SYMBOL(yield);
> > * It's the caller's job to ensure that the target task struct
> > * can't go away on us before we can do any checks.
> > *
> > - * Returns true if we indeed boosted the target task.
> > + * Returns:
> > + * true (>0) if we indeed boosted the target task.
> > + * false (0) if we failed to boost the target.
> > + * -ESRCH if there's no task to yield to.
> > */
> > bool __sched yield_to(struct task_struct *p, bool preempt)
> > {
> > @@ -4303,6 +4306,15 @@ bool __sched yield_to(struct task_struct *p, bool preempt)
> >
> > again:
> > p_rq = task_rq(p);
> > + /*
> > + * If we're the only runnable task on the rq and target rq also
> > + * has only one task, there's absolutely no point in yielding.
> > + */
> > + if (rq->nr_running == 1 && p_rq->nr_running == 1) {
> > + yielded = -ESRCH;
> > + goto out_irq;
> > + }
>
> Looks good to me in principle.
>
> Would be nice to get more consistent benchmark numbers. Once
> those are unambiguously showing that this is a win:
>
> Acked-by: Ingo Molnar <mingo@...nel.org>
>
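To illustrate the bail-out condition in the hunk quoted above, here is a minimal user-space sketch. It is not kernel code: `struct rq_model`, `yield_to_model()` and `spin_handler_model()` are hypothetical names, `can_boost` stands in for the rest of yield_to(), and the policy of stopping the scan on the first non-zero result is only an assumption about how a PLE-style caller might exploit -ESRCH.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU runqueue; only the length is modeled. */
struct rq_model {
	unsigned int nr_running;
};

/*
 * Models the return convention described in the patch comment:
 *   > 0     the target was boosted
 *   0       the boost failed
 *   -ESRCH  source and target runqueues each hold a single task
 * can_boost replaces the real boosting logic, which is not modeled here.
 */
static int yield_to_model(const struct rq_model *rq,
			  const struct rq_model *p_rq, int can_boost)
{
	assert(rq != NULL && p_rq != NULL);

	/* Same test as the quoted hunk: nothing to gain by yielding. */
	if (rq->nr_running == 1 && p_rq->nr_running == 1)
		return -ESRCH;

	return can_boost ? 1 : 0;
}

/*
 * Hypothetical caller: scan candidate targets and stop as soon as one
 * call either boosts a target or reports -ESRCH (assumed policy).
 * Returns the number of candidates tried; *result holds the last value.
 */
static size_t spin_handler_model(const struct rq_model *self,
				 const struct rq_model *targets,
				 const int *can_boost, size_t n, int *result)
{
	size_t tried = 0;

	*result = 0;
	for (size_t i = 0; i < n; i++) {
		tried++;
		*result = yield_to_model(self, &targets[i], can_boost[i]);
		if (*result != 0)
			break;
	}
	return tried;
}
```

In this model, a caller on a single-task runqueue whose first candidate is also alone on its runqueue gives up after one attempt instead of walking every remaining candidate, which is the undercommit case the patch targets.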
I ran the tests with kernbench and sysbench again on a 32-core mx3850
machine with 32-vcpu guests. The results show definite improvements.
ebizzy and dbench show similar improvements for 1x overcommit
(note that the stdev for 1x in dbench is now smaller, and the
improvement is now seen at only 20%).
[ All results are averages over 8 runs. ]
The patches benefit large-guest undercommit scenarios, so I believe
the performance improvement is even more significant with larger
guests. [ Chegu Vinod's results show performance close to the no-PLE
case. ] Unfortunately, I do not have a machine to test larger guests
(>32 vcpus).
Ingo, please let me know if this is okay with you.
base kernel = 3.8.0-rc4
+-----------+-----------+-----------+------------+-----------+
           kernbench (time in sec, lower is better)
+-----------+-----------+-----------+------------+-----------+
          base       stdev     patched       stdev    %improve
+-----------+-----------+-----------+------------+-----------+
1x     46.6028      1.8672     42.4494      1.1390     8.91234
2x     99.9074      9.1859     90.4050      2.6131     9.51121
+-----------+-----------+-----------+------------+-----------+

+-----------+-----------+-----------+------------+-----------+
           sysbench (time in sec, lower is better)
+-----------+-----------+-----------+------------+-----------+
          base       stdev     patched       stdev    %improve
+-----------+-----------+-----------+------------+-----------+
1x     18.7402      0.3764     17.7431      0.3589     5.32065
2x     13.2238      0.1935     13.0096      0.3152     1.61981
+-----------+-----------+-----------+------------+-----------+

+-----------+-----------+-----------+------------+-----------+
           ebizzy (records/sec, higher is better)
+-----------+-----------+-----------+------------+-----------+
          base       stdev     patched       stdev    %improve
+-----------+-----------+-----------+------------+-----------+
1x   2421.9000     19.1801   5883.1000    112.7243   142.91259
+-----------+-----------+-----------+------------+-----------+

+-----------+-----------+-----------+------------+-----------+
           dbench (throughput in MB/sec, higher is better)
+-----------+-----------+-----------+------------+-----------+
          base       stdev     patched       stdev    %improve
+-----------+-----------+-----------+------------+-----------+
1x  11675.9900    857.4154  14103.5000    215.8425    20.79061
+-----------+-----------+-----------+------------+-----------+
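
For anyone re-deriving the %improve column, the small sketch below reproduces it. The formula is inferred from the numbers rather than stated anywhere in this mail: relative change against the base, with the sign flipped for time-based benchmarks where lower is better. The function names are hypothetical.

```c
#include <assert.h>

/* Time-based benchmarks (kernbench, sysbench): lower is better. */
static double pct_improve_time(double base, double patched)
{
	assert(base > 0.0);
	return (base - patched) / base * 100.0;
}

/* Throughput-based benchmarks (ebizzy, dbench): higher is better. */
static double pct_improve_throughput(double base, double patched)
{
	assert(base > 0.0);
	return (patched - base) / base * 100.0;
}
```

For example, pct_improve_time(46.6028, 42.4494) gives roughly 8.91 (kernbench 1x), and pct_improve_throughput(2421.9, 5883.1) gives roughly 142.91 (ebizzy 1x).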