linux-kernel - Re: [report] renicing X, cfs-v5 vs sd-0.46

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070423194821.GA10286@elte.hu>
Date:	Mon, 23 Apr 2007 21:48:21 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Juliusz Chroboczek <jch@....jussieu.fr>,
	Con Kolivas <kernel@...ivas.org>, ck list <ck@....kolivas.org>,
	Bill Davidsen <davidsen@....com>, Willy Tarreau <w@....eu>,
	William Lee Irwin III <wli@...omorphy.com>,
	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Nick Piggin <npiggin@...e.de>, Mike Galbraith <efault@....de>,
	Arjan van de Ven <arjan@...radead.org>,
	Peter Williams <pwil3058@...pond.net.au>,
	Thomas Gleixner <tglx@...utronix.de>, caglar@...dus.org.tr,
	Gene Heskett <gene.heskett@...il.com>
Subject: Re: [report] renicing X, cfs-v5 vs sd-0.46


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> >  4  0      0 475752  13492 176320    0    0     0     0  107 1477 85 15  0  0  0
> >  4  0      0 475752  13492 176320    0    0     0     0  122 1498 84 16  0  0  0
> 
> Did you even *look* at your own numbers? Maybe you looked at 
> "interrpts". The context switch numbers go from 170 per second, to 
> 1500 per second!

i think i managed to look at the correct column :) 1500 per second is 
the absolute ceiling under CFS.

but, even though this utterly ugly hack of renicing (Arjan immediately 
slapped me for it when i mentioned it to him and he correctly predicted 
that lkml would go amok on anything like this) undeniably behaves better 
under CFS and gives a _visually better_ desktop at a 1500 context 
switches per second, i share your unease about it on architectural and 
policy grounds. Doing this hack upstream could easily hinder the 
efficient creation of a healthy economy for "scheduler money", by 
forcibly hacking X out of the picture - while X could be such a nice 
(and important) prototype for a cool and useful new scheduling 
infrastructure.

Basically this hack is bad on policy grounds because it is giving X an 
"legislated, unfair monopoly" on the system. It's the equivalent of a 
state-guaranteed monopoly in certain 'strategic industries'. It has some 
advantages but it is very much net harmful. Most of the time the 
"strategic importance" of any industry can be cleanly driven by the 
normal mechanics of supply and demand: anything important is recognized 
by 'people' as important via actual actions of giving it 'money'. (This 
approach also gives formerly-strategic industries the boot quickly, were 
they to become less strategic to people as things evolve.)

still, recognizing all the very real advantages of a cleaner approach, 
my primary present goal with CFS is to reach "maximum interactivity" 
here and today on a maximimally broad set of workloads, whatever it 
takes, and then to look back and figure out cleaner ways while still 
carefully keeping that maximum interactivity propertly of CFS.

For this particular auto-renicing hack here are the observed objective 
advantages to the user:

 1) while it's still an ugly hack, the increased context-switching rate
    (surprisingly to me!) still has actual, objective, undeniable 
    positive effects even in this totally X-centric worst-case messaging 
    scenario i tried to trigger:

        - visibly better eye-pleasing X behavior under the same
          "performance of scrolling"

        - no hung mouse pointer. Ever. I'd not go as far as Windows to 
          put the mouse refresh code into the kernel, but now having 
          experienced under CFS the 'mouse never hangs under any load' 
          phenomenon for a longer time, i have to admit i got addicted 
          to it. It give instant emotionally positive feedback about 
          "yes, your system is still fine, just overworked a bit", and 
          it also gives a "you caused something to happen on this box, 
          cool boy!" reassurance to the impatient human who is waiting 
          on it - be it that such a minimal thing as a moving mouse 
          pointer.

    There's a new argument as well, not amongst the issues i raised 
    before: people are happily spending 40-50% of their CPU's
    power on Beryl just to get a more ergonomic desktop via 3D effects, 
    so why not allow them to achieve another type of visual ergonomy by 
    allowing an increased, maximum-throttled X context-switch rate,
    without any measurable drop in performance, to a tunable maximum? 
    I can see no easy way for X itself to control this context-switching
    "refresh" rate in a sane way, as its workload is largely detached
    from client workloads and there's no communication between clients.

 2) it's the absolute worst maximum rate you'll ever see under CFS, and
    i definitely concentrated on triggering the worst-case. On other
    schedulers i easily got to 14K context-switches per second or worse,
    depending on the X workload, which hurts performance and makes it 
    behave visually worse. On CFS the 1400 context-switches is the 
    _ceiling_, did not measurably hurt performance and it is tunable 
    ceiling.

 3) this behavior was totally uncontrollable on other schedulers i tried
    and indeed has hurt performance there. On CFS this is still totally
    tunable and controllable on several levels.

i'm not saying that any of this reduces the ugliness of the hack, or 
that any of this makes the strategic disadvantages of this hack 
disappear, i simply tried to point out that despite the existing 
conventional wisdom it's apparently much more useful in practice on CFS 
than on other schedulers.

And if the "economy of scheduling" experiment fails in practice for some 
presently unknown technological reason, we might as well have to go back 
to ugly tricks like this one. With its 5 lines and limited scope i think 
it still beats 500 lines of convoluted scheduling heuristics :-/ Right 
now i'm very positive about the "economy of scheduling" angle, i think 
we have a realistic chance to pull it off.

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/