linux-kernel - Re: CFS scheduler unfairly prefers pinned tasks

Message-ID: <1444314345.3565.78.camel@gmail.com>
Date:	Thu, 08 Oct 2015 16:25:45 +0200
From:	Mike Galbraith <umgwanakikbuti@...il.com>
To:	paul.szabo@...ney.edu.au
Cc:	peterz@...radead.org, linux-kernel@...r.kernel.org
Subject: Re: CFS scheduler unfairly prefers pinned tasks

On Thu, 2015-10-08 at 21:54 +1100, paul.szabo@...ney.edu.au wrote:
> Dear Mike,
> 
> > I see a fairness issue ... but one opposite to your complaint.
> 
> Why is that opposite? I think it would be fair for the one pert process
> to get 100% CPU, the many oink processes can get everything else. That
> one oink is lowly 10% (when others are 100%) is of no consequence.

Well, not exactly opposite, only opposite in that the one pert task also
receives MORE than its fair share when unpinned.  Two 100% hogs sharing
one CPU should each get 50% of that CPU.  The fact that the oink group
contains 8 tasks vs 1 for the pert group should be irrelevant, but what
that last oinker is getting is 1/9 of a CPU, and there just happen to be
9 runnable tasks total, 1 in group pert, and 8 in group oink.
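
A minimal sketch of the arithmetic above, in Python. It assumes the two
groups carry equal weight (e.g. default cpu.shares) and that exactly one
pert task and one oink task compete for the shared CPU; the function
names are hypothetical and are not kernel code.

```python
from fractions import Fraction

def fair_share_on_cpu(hogs_on_cpu):
    """Share each of N always-runnable hogs should get on one CPU."""
    return Fraction(1, hogs_on_cpu)

def share_by_total_task_count(tasks_per_group):
    """Share a task gets if the CPU is split by the total runnable task
    count across all groups (the 1/9 ratio described in the mail)."""
    return Fraction(1, sum(tasks_per_group.values()))

# Assumption: equal-weight groups, 1 pert task and 8 oink tasks runnable.
groups = {"pert": 1, "oink": 8}

expected = fair_share_on_cpu(2)               # pert and one oinker share a CPU
observed = share_by_total_task_count(groups)  # what the last oinker gets

print(f"expected share: {expected}")             # 1/2
print(f"observed share: {observed}")             # 1/9
print(f"shortfall:      {expected / observed}x")  # 9/2x
```

Under those assumptions the starved oinker runs 4.5 times slower than a
50/50 split would allow.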

IFF that ratio were to prove to be a constant, AND the oink group were a
massively parallel and synchronized compute job on a huge box, that
entire compute job would not merely be slowed down by the factor of 2
that a fair distribution would cost it.  On, say, a 1000 core box, it'd
be.. utterly dead, because you'd put it out of your misery.

vogelweide:~/:[0]# cgexec -g cpu:foo bash
vogelweide:~/:[0]# for i in `seq 0 63`; do taskset -c $i cpuhog& done
[1] 8025
[2] 8026
...
vogelweide:~/:[130]# cgexec -g cpu:bar bash
vogelweide:~/:[130]# taskset -c 63 pert 10    # report every 10 seconds
2260.91 MHZ CPU
perturbation threshold 0.024 usecs.
pert/s:      255 >2070.76us:       38 min:  0.05 max:4065.46 avg: 93.83 sum/s: 23946us overhead: 2.39%
pert/s:      255 >2070.32us:       37 min:  1.32 max:4039.94 avg: 92.82 sum/s: 23744us overhead: 2.37%
pert/s:      253 >2069.85us:       38 min:  0.05 max:4036.44 avg: 94.89 sum/s: 24054us overhead: 2.41%

Hm, that's a kinda odd looking number from my 64 core box, but whatever,
it's far from fair according to my definition thereof.  Poor little oink
plus all other cycles not spent in pert's tight loop add up to ~24ms/s.
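
As a rough cross-check of the ~24ms/s figure, the sketch below (Python)
pulls the sum/s field out of the three pert lines quoted above and
converts it to a percentage of one second. It assumes the "overhead"
column is simply sum/s divided by one second, which matches the reported
values; the helper name is hypothetical and not part of pert.

```python
import re

# The three report lines from the transcript above, verbatim.
PERT_LINES = [
    "pert/s:      255 >2070.76us:       38 min:  0.05 max:4065.46 avg: 93.83 sum/s: 23946us overhead: 2.39%",
    "pert/s:      255 >2070.32us:       37 min:  1.32 max:4039.94 avg: 92.82 sum/s: 23744us overhead: 2.37%",
    "pert/s:      253 >2069.85us:       38 min:  0.05 max:4036.44 avg: 94.89 sum/s: 24054us overhead: 2.41%",
]

def stolen_percent(line):
    """Percent of each second not spent in pert's loop, from sum/s."""
    usecs = int(re.search(r"sum/s:\s*(\d+)us", line).group(1))
    return usecs / 1_000_000 * 100

percents = [stolen_percent(line) for line in PERT_LINES]
for p in percents:
    print(f"{p:.2f}% of the CPU went to everything other than pert")

# A fair split between two hogs on one CPU would leave oink about 50%.
print(f"average: {sum(percents) / len(percents):.2f}% vs 50% expected")
```

So the pinned oinker, together with everything else on that CPU, is
getting roughly 2.4% of it.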

> Good to see that you agree on the fairness issue... it MUST be fixed!
> CFS might be wrong or wasteful, but never unfair.

Weeell, we've disagreed on pretty much everything we've talked about so
far, but I can well imagine that what I see in the share update business
_could_ be part of your massive compute job woes.

	-Mike
