Message-ID: <1294889169.8089.10.camel@marge.simson.net>
Date: Thu, 13 Jan 2011 04:26:09 +0100
From: Mike Galbraith <efault@....de>
To: Rik van Riel <riel@...hat.com>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Avi Kiviti <avi@...hat.com>,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Chris Wright <chrisw@...s-sol.org>
Subject: Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

On Wed, 2011-01-12 at 22:02 -0500, Rik van Riel wrote:
> Cgroups only makes the matter worse - libvirt places
> each KVM guest into its own cgroup, so a VCPU will
> generally always be alone on its own per-cgroup, per-cpu
> runqueue! That can lead to pulling a VCPU onto our local
> CPU because we think we are alone, when in reality we
> share the CPU with others...

How can that happen? If the task you're trying to accelerate isn't in
your task group, the whole attempt should be a noop.

> Removing the pulling code allows me to use all 4
> CPUs with a 4-VCPU KVM guest in an uncontended situation.
>
> > + /* Tell the scheduler that we'd really like pse to run next. */
> > + p_cfs_rq->next = pse;
>
> Using set_next_buddy propagates this up to the root,
> allowing the scheduler to actually know who we want to
> run next when cgroups is involved.
>
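The difference between setting ->next on one runqueue and propagating it to the root can be illustrated with a toy model. This is only a sketch, not kernel code: the toy_* names and struct layouts are invented for illustration, and the only idea taken from the discussion is that the hint has to be set at every level of the group hierarchy for the root to see it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not the kernel's data structures): every
 * entity is queued on one runqueue, and a group's runqueue is itself
 * represented by an entity queued one level up.
 */
struct toy_rq;

struct toy_entity {
    struct toy_rq *rq;          /* queue this entity is enqueued on */
    struct toy_entity *parent;  /* group entity one level up, NULL at root */
};

struct toy_rq {
    struct toy_entity *next;    /* "please run this one next" hint */
};

/* Set the hint only on the entity's own queue, like a bare ->next = pse. */
static void toy_set_next_local(struct toy_entity *se)
{
    assert(se != NULL);
    se->rq->next = se;
}

/* Set the hint at every level up to the root queue. */
static void toy_set_next_buddy(struct toy_entity *se)
{
    assert(se != NULL);
    for (; se != NULL; se = se->parent)
        se->rq->next = se;
}
```

With a VCPU entity inside a per-guest group, toy_set_next_local() leaves the root queue's hint untouched, so a scheduler picking from the root never learns about the request. toy_set_next_buddy() also marks the group's entity on the root queue.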
> > + /* We know whether we want to preempt or not, but are we allowed? */
> > + if (preempt && same_thread_group(p, task_of(p_cfs_rq->curr)))
> > +     resched_task(task_of(p_cfs_rq->curr));
>
> With this in place, we can get into the situation where
> we will gladly give up CPU time, but not actually give
> any to the other VCPUs in our guest.
>
> I believe we can get rid of that test, because pick_next_entity
> already makes sure it ignores ->next if picking ->next would
> lead to unfairness.

Preempting everybody who is in your way isn't playing nice neighbor, so
I think at least the same_thread_group() test needs to stay. But that's
Peter's call. Starting a zillion threads to play wakeup preempt and
hog the CPU isn't nice either, but it's allowed.

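The effect of keeping the same_thread_group() test can be sketched with a small stand-alone model. It is an illustration only: the toy_* types and functions are hypothetical, and the only logic mirrored from the quoted patch is "reschedule the current task only if preemption is wanted and it shares a thread group with the yield target".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct toy_task {
    int tgid;           /* thread group id */
    bool need_resched;  /* set when the task is asked to give up the CPU */
};

static bool toy_same_thread_group(const struct toy_task *a,
                                  const struct toy_task *b)
{
    return a->tgid == b->tgid;
}

/*
 * Ask curr to reschedule only when preemption is wanted and curr belongs
 * to the same thread group as the yield target. Returns true if curr was
 * asked to reschedule.
 */
static bool toy_maybe_preempt(const struct toy_task *target,
                              struct toy_task *curr, bool preempt)
{
    assert(target != NULL && curr != NULL);
    if (preempt && toy_same_thread_group(target, curr)) {
        curr->need_resched = true;
        return true;
    }
    return false;
}
```

In this model a VCPU thread can push aside a sibling thread of the same guest, while a task belonging to somebody else keeps running until the regular fairness rules preempt it.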
> Removing this test (and simplifying yield_to_task_fair) seems
> to lead to more predictable test results.

Less is more :)

-Mike