linux-kernel - Re: [RFC -v3 PATCH 2/3] sched: add yield


Message-ID: <1294889169.8089.10.camel@marge.simson.net>
Date:	Thu, 13 Jan 2011 04:26:09 +0100
From:	Mike Galbraith <efault@....de>
To:	Rik van Riel <riel@...hat.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, Avi Kiviti <avi@...hat.com>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Chris Wright <chrisw@...s-sol.org>
Subject: Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

On Wed, 2011-01-12 at 22:02 -0500, Rik van Riel wrote:

> Cgroups only makes the matter worse - libvirt places
> each KVM guest into its own cgroup, so a VCPU will
> generally always be alone on its own per-cgroup, per-cpu
> runqueue!  That can lead to pulling a VCPU onto our local
> CPU because we think we are alone, when in reality we
> share the CPU with others...

How can that happen?  If the task you're trying to accelerate isn't in
your task group, the whole attempt should be a noop.
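A rough userspace sketch of that expectation, not kernel code. The toy_task
type, its integer task_group id, and toy_yield_to_allowed() are hypothetical
names made up for illustration; the real scheduler compares task group
pointers reached through the scheduling entities.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct toy_task {
	int pid;
	int task_group;	/* assumed id of the cgroup-backed task group */
};

/*
 * Returns true only if a yield_to() attempt from curr to target would be
 * acted on at all.  A target outside curr's task group turns the whole
 * attempt into a no-op, as does yielding to oneself.
 */
static bool toy_yield_to_allowed(const struct toy_task *curr,
				 const struct toy_task *target)
{
	if (curr == target)
		return false;
	return curr->task_group == target->task_group;
}
```

With libvirt placing each guest in its own cgroup, all VCPUs of one guest
would share a task_group value in this model, so a yield between them passes
the check while a yield to another guest's VCPU does nothing.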

> Removing the pulling code allows me to use all 4
> CPUs with a 4-VCPU KVM guest in an uncontended situation.
> 
> > +	/* Tell the scheduler that we'd really like pse to run next. */
> > +	p_cfs_rq->next = pse;
> 
> Using set_next_buddy propagates this up to the root,
> allowing the scheduler to actually know who we want to
> run next when cgroups is involved.
> 
> > +	/* We know whether we want to preempt or not, but are we allowed? */
> > +	if (preempt && same_thread_group(p, task_of(p_cfs_rq->curr)))
> > +		resched_task(task_of(p_cfs_rq->curr));
> 
> With this in place, we can get into the situation where
> we will gladly give up CPU time, but not actually give
> any to the other VCPUs in our guest.
> 
> I believe we can get rid of that test, because pick_next_entity
> already makes sure it ignores ->next if picking ->next would
> lead to unfairness.
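
For readers following along, the fairness guard referred to above can be
modelled roughly like this.  It is a simplified userspace sketch under my own
assumptions: toy_entity, toy_rq, the gran field and toy_pick_next() are
hypothetical names, and the real pick_next_entity() works on the rbtree and a
weighted wakeup granularity rather than a plain number.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scheduling entity: only the virtual runtime. */
struct toy_entity {
	uint64_t vruntime;
};

/* Hypothetical runqueue view: leftmost entity, the ->next hint, a granularity. */
struct toy_rq {
	struct toy_entity *leftmost;	/* entity with the smallest vruntime */
	struct toy_entity *next;	/* buddy hint, may be NULL */
	uint64_t gran;			/* assumed allowed vruntime lead */
};

/*
 * Pick the leftmost entity unless the ->next hint is set and is not more
 * than gran ahead of it in virtual runtime.  A hint that is too far ahead
 * is ignored, which is what keeps the hint from causing unfairness.
 */
static struct toy_entity *toy_pick_next(const struct toy_rq *rq)
{
	struct toy_entity *se = rq->leftmost;

	if (se == NULL)
		return NULL;

	if (rq->next != NULL) {
		/* Signed difference so a hint that is behind leftmost also wins. */
		int64_t lead = (int64_t)(rq->next->vruntime - se->vruntime);

		if (lead <= (int64_t)rq->gran)
			se = rq->next;
	}
	return se;
}
```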

Preempting everybody who is in your way isn't playing nice neighbor, so
I think at least the same_thread_group() test needs to stay.  But that's
Peter's call.  Starting a zillion threads to play "wakeup preempt and
let's hog the CPU" isn't nice either, but it's allowed.
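
To spell out what keeping the test means, here is a toy model of the
decision in the quoted hunk.  Everything in it is an assumption for
illustration: toy_thread, its tgid field and both helpers are hypothetical
userspace stand-ins, not the kernel's same_thread_group() or resched_task().

```c
#include <stdbool.h>

/* Hypothetical stand-in for a thread; tgid models the thread group id. */
struct toy_thread {
	int pid;
	int tgid;
};

/* Toy equivalent of a same-thread-group check: compare thread group ids. */
static bool toy_same_thread_group(const struct toy_thread *a,
				  const struct toy_thread *b)
{
	return a->tgid == b->tgid;
}

/*
 * Decide whether the task currently running on the target's runqueue may
 * be rescheduled.  The ->next hint would be set regardless; only the
 * forced preemption is limited to victims in the target's own thread group.
 */
static bool toy_should_resched_curr(bool preempt,
				    const struct toy_thread *target,
				    const struct toy_thread *rq_curr)
{
	return preempt && toy_same_thread_group(target, rq_curr);
}
```

In this model a VCPU thread can kick a sibling thread of the same process
off the CPU, but an unrelated task that happens to be running there is left
alone and only the buddy hint applies.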

> Removing this test (and simplifying yield_to_task_fair) seems
> to lead to more predictable test results.

Less is more :)

	-Mike
