Message-ID: <1299051643.17065.11.camel@marge.simson.net>
Date: Wed, 02 Mar 2011 08:40:43 +0100
From: Mike Galbraith <efault@....de>
To: Paul Turner <pjt@...gle.com>
Cc: Venkatesh Pallipadi <venki@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH] sched: next buddy hint on sleep and preempt path
On Tue, 2011-03-01 at 23:08 -0800, Paul Turner wrote:
> On Tue, Mar 1, 2011 at 10:47 PM, Mike Galbraith <efault@....de> wrote:
> > On Tue, 2011-03-01 at 21:43 -0800, Paul Turner wrote:
> >> On Tue, Mar 1, 2011 at 3:33 PM, Venkatesh Pallipadi <venki@...gle.com> wrote:
> >
> >> >  	for_each_sched_entity(se) {
> >> >  		cfs_rq = cfs_rq_of(se);
> >> >  		dequeue_entity(cfs_rq, se, flags);
> >> >
> >> >  		/* Don't dequeue parent if it has other entities besides us */
> >> > -		if (cfs_rq->load.weight)
> >> > +		if (cfs_rq->load.weight) {
> >> > +			/*
> >> > +			 * Bias pick_next to pick a task from this cfs_rq, as
> >> > +			 * p is sleeping when it is within its sched_slice.
> >> > +			 */
> >> > +			if (task_flags & DEQUEUE_SLEEP && se->parent)
> >> > +				set_next_buddy(se->parent);
> >>
> >> re-using the last_buddy would seem like a more natural fit here; it
> >> also doesn't have a clobber race with a wakeup
> >
> > Hm, that would break last_buddy, no?  A preempted task won't get the CPU
> > back after a light preempting thread deactivates.  (It's disabled atm
> > unless heavily overloaded anyway, but..)
>
> Ommm yeah.. we're actually a little snookered in this case, since the
> preempting thread's sleep will be voluntary, which will try to return
> time to its hierarchy.
>
> I suppose keeping the last_buddy is preferable to the occasional clobber.
Yeah, I think we don't want to break it.  I don't know if pgsql still
uses userland spinlocks (I haven't run it in quite a while now), but with
those nasty things, last_buddy was the only thing that kept it from
collapsing into a quivering heap when you tried to scale.  Preempting a
userland spinlock holder gets ugly in the extreme.
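
For reference, the way the buddy hints bias pick_next_entity() is roughly
the following (a simplified sketch from memory, not the exact fair.c code;
the helper names, the buddy ordering and the fairness-check details vary
between kernel versions):

/*
 * Sketch: how the next/last buddy hints bias entity selection.
 * "left" is the leftmost (smallest vruntime) entity; a buddy only
 * overrides it if that isn't grossly unfair (wakeup_preempt_entity()
 * returning < 1 means the override is acceptable).
 */
static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
{
	struct sched_entity *left = __pick_next_entity(cfs_rq);
	struct sched_entity *se = left;

	/* next buddy: prefer whatever a recent dequeue/preempt hinted at */
	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
		se = cfs_rq->next;

	/* last buddy: try to hand the CPU back to the task we preempted */
	if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1)
		se = cfs_rq->last;

	return se;
}

It's that last_buddy preference that gets the CPU back to a preempted
userland spinlock holder quickly; clobbering it for the sleep hint would
lose that.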
I'm going to test this patch some more, but in light testing I saw no
interactivity problems with it, and it does _seem_ to improve throughput
when competing grouped loads share the box.  I haven't tested that
heftily though; that's just from watching the numbers and recalling the
relative effect of mixing loads previously.
-Mike