linux-kernel - Re: [PATCH 6/6] sched: disabled rt-bandwidth by default

Message-ID: <20080828105408.GA4488@elte.hu>
Date:	Thu, 28 Aug 2008 12:54:08 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Nick Piggin <nickpiggin@...oo.com.au>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org,
	Stefani Seibold <stefani@...bold.net>,
	Dario Faggioli <raistlin@...ux.it>,
	Max Krasnyansky <maxk@...lcomm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH 6/6] sched: disabled rt-bandwidth by default

* Nick Piggin <nickpiggin@...oo.com.au> wrote:

> On Wednesday 27 August 2008 08:49, Andi Kleen wrote:
> > Thomas Gleixner <tglx@...utronix.de> writes:
> > > Well, we might have a public opinion poll, whether a system is
> > > declared frozen after 1, 10 or 100 seconds. Even a one second
> > > unresponsiveness shows up on the kernel bugzilla and you request that
> > > unlimited unresponsiveness w/o a chance to debug it is the sane
> > > default.
> >
> > That assumes single CPU. With multiple CPUs and not
> > all hogged the system should be still responsive?
> 
> Right.

Wrong.

Even if the system has multiple CPUs, and even if just a single CPU is 
fully utilized by an RT task, without the rt-limit the system will still 
lock up in practice due to various other factors: workqueues and tasks 
being 'stuck' on CPUs that host an RT hog. While there's obviously CPU 
time available on other CPUs, you cannot run 'top', the desktop will 
freeze, workflows on the system can get stuck, etc.

With the rt limit in place, it's all pretty smooth and debuggable. Even 
with all CPUs hogged by SCHED_FIFO prio 99 the system is laggy but 
debuggable - the user can run 'top' and can resolve the situation.
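
To make the arithmetic behind the rt limit concrete, here is a minimal 
sketch of how much CPU time per period stays available to non-RT tasks. 
The function name and the runtime/period values are illustrative 
assumptions, not quoted from this thread: 950000 us out of 1000000 us 
is what I understand mainline kernels to ship as the default, and a 
negative runtime is taken to mean "limit disabled".

```python
def non_rt_share(runtime_us: int, period_us: int) -> float:
    """Fraction of each period left over for non-RT tasks.

    runtime_us: time RT tasks may consume per period (assumed: < 0 = no limit)
    period_us:  length of the accounting period
    """
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    if runtime_us < 0:
        # Limit disabled: an RT hog can take the whole period.
        return 0.0
    # RT tasks cannot use more than the period itself.
    used = min(runtime_us, period_us)
    return (period_us - used) / period_us


# Assumed default budget: 0.95s of RT time per 1s period.
print(non_rt_share(950000, 1000000))
# Limit disabled: nothing is reserved for 'top' or the desktop.
print(non_rt_share(-1, 1000000))
```

Under those assumed defaults a small slice of every second is left for 
ordinary tasks, which is why the system is laggy but still debuggable.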

Really, this reply of yours shows something startling: that despite this 
many mails you still have never actually tried to run the scenario you 
are complaining about: you have never tried to run a CPU hog high-prio 
RT task on a Linux system before, and you have never observed the 
effects it has on general system stability and debuggability.

This fundamental lack of experience weakens all your arguments and I 
don't even know why you are arguing about it. Do you perhaps have some 
customer application/workload you are worried about? If you do, then 
please tell us the exact specifics - this handwaving about 
compliance really makes little sense.

In other words: in our car the air-bag continues to be enabled by 
default, and if someone wants to use the car for stunts the air-bag can 
be disabled via that handy sysctl.
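
For anyone who wants to check which mode their box is in, a rough 
sketch follows. The sysctl is not named in this mail; the file names 
below (sched_rt_runtime_us and sched_rt_period_us under 
/proc/sys/kernel) are my assumption about which knobs are meant, and 
the helper names are hypothetical.

```python
from pathlib import Path


def read_rt_limit(sysctl_dir: str = "/proc/sys/kernel") -> tuple[int, int]:
    """Return (runtime_us, period_us) read from the assumed sysctl files."""
    base = Path(sysctl_dir)
    runtime_us = int((base / "sched_rt_runtime_us").read_text().strip())
    period_us = int((base / "sched_rt_period_us").read_text().strip())
    return runtime_us, period_us


def rt_limit_enabled(sysctl_dir: str = "/proc/sys/kernel") -> bool:
    """True if RT tasks are capped below 100% of each period.

    Assumption: a negative runtime means the limit is switched off.
    """
    runtime_us, period_us = read_rt_limit(sysctl_dir)
    return 0 <= runtime_us < period_us
```

Calling rt_limit_enabled() with no arguments reads the live values; 
passing another directory is only there so the helper can be tried 
against scratch files.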

In any case I think I'm going to ignore this thread from now on: nothing 
new has really been said, and the general tone of the discussion is 
deteriorating. You are also very late in raising objections - the 
rt-limit feature was posted 10 months ago and went upstream 8 months 
ago - two full kernel cycles have been completed with this change in 
place and a third one is almost finished.

        Ingo