Message-ID: <5318A30B.4020702@oracle.com>
Date:	Thu, 06 Mar 2014 09:32:11 -0700
From:	Khalid Aziz <khalid.aziz@...cle.com>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	Peter Zijlstra <peterz@...radead.org>,
	Andi Kleen <andi@...stfloor.org>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	Oleg Nesterov <oleg@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH] Pre-emption control for userspace

On 03/06/2014 04:14 AM, Thomas Gleixner wrote:
> We understand that you want to avoid preemption in the first place and
> not getting into the contention handling case.
>
> But, what you're trying to do is essentially creating an ABI which we
> have to support and maintain forever. And that definitely is worth a
> few serious questions.

Fair enough. I agree a new ABI should not be created lightly.

>
> Lets ignore the mm related issues for now as those can be solved. That's
> the least of my worries.
>
> Right now you are using this for a single use case with a well defined
> environment, where all related threads reside in the same scheduling
> class (FAIR). But that's one of a gazillion of use cases of Linux.
>

Creating a new ABI for a single use case or a special case is something
I would argue against as well; I am with you on that. What I am saying
is that databases and the JVM happen to be two real-world examples of a
scenario where CFS can inadvertently cause a convoy problem for a
well-designed critical section that represents a small portion of the
thread's overall execution, simply because of where in the current
timeslice the critical section is hit. If others have come across
further examples, I would love to hear them. If we can indeed say this
is a very special case for an uncommon workload, I would completely
agree with refusing to create a new ABI.

> If we allow you to special case your database workload then we have no
> argument why we should not do the same thing for realtime workloads
> where the SCHED_FAIR housekeeping thread can hold a lock shortly to
> access some important data in the SCHED_FIFO realtime computation
> thread. Of course the RT people want to avoid the lock contention as
> much as you do, just for different reasons.
>
> Add SCHED_EDF, cgroups and hierarchical scheduling to the picture and
> hell breaks loose.

Realtime and deadline scheduling policies are supposed to be higher
priority than CFS. A thread running under CFS that can impact threads
running under realtime policies is a bad thing, agreed? What I am
proposing actually allows a CFS thread to get out of the way of
realtime threads more quickly. In your specific example, the SCHED_FAIR
housekeeping thread could use the exact same mechanism I am proposing
to get out of the SCHED_FIFO threads' way: its critical section gets a
better chance to complete, while its cache is hot, before it causes a
convoy problem. The logic is not onerous. A thread asks for amnesty
from one context switch if and only if a rescheduling point happens in
the middle of its timeslice. If no rescheduling point occurs during its
critical section, the thread takes the request back and life goes on as
if nothing changed. If a rescheduling point does happen in the middle
of the thread's critical section, it gets the amnesty, but it yields
the processor as soon as it is done with the critical section. Any
thread that does not play nice gets penalized the next time it asks for
immunity (as hpa suggested).

Thanks,
Khalid

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
