linux-kernel - Re: sched_yield: delete sysctl_sched_compat


Message-Id: <1196397129.25646.78.camel@ymzhang>
Date:	Fri, 30 Nov 2007 12:32:09 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Nick Piggin <nickpiggin@...oo.com.au>
Cc:	Arjan van de Ven <arjan@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>, mingo@...e.hu,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: sched_yield: delete sysctl_sched_compat_yield

On Fri, 2007-11-30 at 14:29 +1100, Nick Piggin wrote:
> On Friday 30 November 2007 14:15, Zhang, Yanmin wrote:
> > On Fri, 2007-11-30 at 13:46 +1100, Nick Piggin wrote:
> > > On Wednesday 28 November 2007 09:57, Arjan van de Ven wrote:
> 
> > > > sounds like a bad idea; volanomark (well, technically the jvm behind
> > > > it) is abusing sched_yield() by assuming it does something it really
> > > > doesn't do, and as it happens some of the earlier 2.6 schedulers
> > > > accidentally happened to behave in a way that was nice for this
> > > > benchmark.
> > >
> > > OK, why is this still happening? Haven't we been asking JVMs to use
> > > futexes or posix locking for years and years now? Are there any sane
> > > jvms that _don't_ use yield?
> >
> > I think it's an issue of volanomark (a kind of java application) instead of
> > JVM.
> 
> volanomark itself and not the jvm is calling sched_yield()? Do we have
> any non-toy threaded java apps? (what's JAVA in the kernel-perf tests?)
I run lots of well-known benchmarks, and volanoMark is the one most affected
by the sched_yield behavior.

As for real applications that use sched_yield, most of them are not open source.
Yesterday I learned that someone is using sched_yield in his network C programs,
but he didn't want to share the sources with me.
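
For illustration only, here is a minimal C sketch of the kind of yield-based
polling loop being discussed. It is not taken from volanoMark, a JVM, or the
network program mentioned above (those sources were not available); the names
yield_until, ready_fn and countdown_ready are hypothetical. The loop makes no
progress on its own: how soon the condition becomes true depends on which task
the scheduler picks after each sched_yield() call, which is why such code is
sensitive to the yield implementation.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate type: returns true once the awaited condition holds. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll `ready` and call sched_yield() between polls.
 * Returns the number of yields performed, or -1 if `max_yields`
 * yields were done and the condition still did not hold.
 */
long yield_until(ready_fn ready, void *ctx, long max_yields)
{
    long yields = 0;

    assert(ready != NULL);
    while (!ready(ctx)) {
        if (yields == max_yields)
            return -1;
        sched_yield();   /* what happens next is up to the scheduler */
        yields++;
    }
    return yields;
}

/* Stand-in predicate for the sketch: true once a countdown reaches zero. */
bool countdown_ready(void *ctx)
{
    int *remaining = ctx;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

In a real program the predicate would test state changed by another thread or
process; a futex or a POSIX mutex/condition variable blocks until that change
happens instead of repeatedly asking the scheduler to run something else.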

> 
> 
> > > > Today's kernel behaves somewhat differently (and before people
> > > > scream "regression"; sched_yield() behavior isn't really specified and
> > > > doesn't make any sense at all, whatever you get is what you get....
> > > > it's pretty much an insane de facto behavior that is incredibly tied to
> > > > which decisions the scheduler makes how, and no app can depend on that
> > >
> > > It is a performance regression. Is there any reason *not* to use the
> > > "compat" yield by default?
> >
> > There is none, so I suggest setting sched_compat_yield=1 by default.
> > With sched_compat_yield=0, the kernel does almost nothing and returns. With
> > sched_compat_yield=1, it is closer to what the sched_yield man page describes.
> 
> sched_yield() is really only defined for posix realtime scheduling
> AFAIK, which talks about priority lists. 
> 
> SCHED_OTHER is defined to be a single priority, below the rest of the
> realtime priorities. So at first you *might* say that the process
> should then be made to run only after all other SCHED_OTHER processes,
> however there is no such ordering requirement for SCHED_OTHER
> scheduling. The SCHED_OTHER scheduler can run any task at any time.
> 
> That said, I think people would *expect* that call to be much closer to
> the compat behaviour than the current default. And that's definitely
> what Linux has done in the past. So there really does need to be a
> good reason to change it like this IMO.
That's indeed what I am thinking.

I am running many tests (SPECjbb/SPECjbb2005/cpu2000/iozone/dbench/tbench...) to
see whether sched_compat_yield=1 causes any regression. I don't expect any;
the testing is just to double-check.
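
A rough sketch of how the setting could be checked or flipped from C between
test runs. It assumes the sysctl is exposed as
/proc/sys/kernel/sched_compat_yield, which only holds on kernels that carry
this knob; the helper names read_int_sysctl and write_int_sysctl are
hypothetical, and writing the real file requires root.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location of the knob; the file is absent on kernels without it. */
#define COMPAT_YIELD_PATH "/proc/sys/kernel/sched_compat_yield"

/* Read one integer from `path`. Returns -1 if the file cannot be read. */
int read_int_sysctl(const char *path)
{
    FILE *f;
    int value;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Write one integer to `path`. Returns 0 on success, -1 on failure. */
int write_int_sysctl(const char *path, int value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A test driver would call write_int_sysctl(COMPAT_YIELD_PATH, 1) before a
benchmark run and read_int_sysctl(COMPAT_YIELD_PATH) afterwards to confirm
which mode was in effect; a return of -1 means the knob is not available.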

> 
> 
> > > As you say, for SCHED_OTHER tasks, yield
> > > can do almost anything. We may as well do something that isn't a
> > > regression...
> >
> > I just found SCHED_OTHER in man sched_setscheduler. Is it SCHED_NORMAL in
> > the latest kernel?
> 
> Yes, SCHED_NORMAL is SCHED_OTHER. Don't know why it got renamed...
Thanks.
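
The "single priority" point about SCHED_OTHER quoted above can be seen from
user space with a small C sketch. It uses only the standard
sched_get_priority_min()/sched_get_priority_max() calls; the helper name
priority_span is hypothetical. On Linux both calls return 0 for SCHED_OTHER,
so the span is 0, while a realtime policy such as SCHED_FIFO has a range of
priorities for the yield ordering rules to apply to.

```c
#include <assert.h>
#include <sched.h>

/*
 * Width of the static priority range for `policy` (max - min).
 * Returns -1 if the system does not recognize the policy.
 */
int priority_span(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    assert(lo <= hi);
    return hi - lo;
}
```

With a span of 0 there is only one priority list position to talk about, so
"move to the end of the list for its priority" says nothing about how
SCHED_OTHER tasks are ordered relative to each other.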