Message-ID: <46A66393.5000705@redhat.com>
Date:	Tue, 24 Jul 2007 16:39:47 -0400
From:	Chris Snook <csnook@...hat.com>
To:	Chris Friesen <cfriesen@...tel.com>
CC:	Tong Li <tong.n.li@...el.com>, mingo@...e.hu,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC] scheduler: improve SMP fairness in CFS

Chris Friesen wrote:
> Chris Snook wrote:
> 
>> Concerns aside, I agree that fairness is important, and I'd really 
>> like to see a test case that demonstrates the problem.
> 
> One place where this might be useful is fairness between resource 
> groups, where the load balancer needs to consider each group separately.

You mean like the CFS group scheduler patches?  I don't see how this patch is 
related to that, besides working on top of it.

> Now it may be the case that trying to keep the load of each class within 
> X% of the other cpus is sufficient, but it's not trivial.

I agree.  My suggestion is that we try to be fair from the bottom up, rather 
than from the top down.  If most of the rebalancing is local, we can minimize 
expensive locking and cross-node migrations, and scale very nicely on large 
NUMA boxes.
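
To make that concrete, here's a toy sketch of what I mean by local-first 
balancing.  This is purely illustrative -- nothing like the actual scheduler 
code -- and the topology (2 nodes x 2 CPUs) and load numbers are made up.  An 
underloaded CPU looks for work inside its own node first, and only considers 
a cross-node pull when the local pass finds nothing:

/*
 * Purely illustrative -- not kernel code.  Toy machine: 2 nodes x 2 CPUs,
 * with a per-CPU load count.
 */
#include <stdio.h>

#define NR_CPUS       4
#define NODE_OF(cpu)  ((cpu) / 2)      /* CPUs 0-1 on node 0, 2-3 on node 1 */

static unsigned long load[NR_CPUS] = { 8, 1, 3, 3 };    /* made-up loads */

/* Busiest CPU in 'node' with load strictly above 'min_load', or -1. */
static int busiest_in_node(int node, unsigned long min_load)
{
	int cpu, best = -1;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (NODE_OF(cpu) == node && load[cpu] > min_load) {
			min_load = load[cpu];
			best = cpu;
		}
	return best;
}

/* Move one unit of load from src to dst. */
static void pull(int src, int dst)
{
	load[src]--;
	load[dst]++;
	printf("cpu%d pulls from cpu%d (%s-node)\n", dst, src,
	       NODE_OF(src) == NODE_OF(dst) ? "same" : "cross");
}

static void rebalance(int this_cpu)
{
	int node, src;

	/* Local pass: same node, cheap. */
	src = busiest_in_node(NODE_OF(this_cpu), load[this_cpu]);
	if (src >= 0) {
		pull(src, this_cpu);
		return;
	}

	/* Escalate only when the local pass found nothing. */
	for (node = 0; node < NR_CPUS / 2; node++) {
		if (node == NODE_OF(this_cpu))
			continue;
		src = busiest_in_node(node, load[this_cpu]);
		if (src >= 0) {
			pull(src, this_cpu);
			return;
		}
	}
}

int main(void)
{
	rebalance(1);	/* cpu1 finds work on its own node: the cheap, common case */
	rebalance(3);	/* node 1 is internally balanced, so cpu3 goes cross-node */
	return 0;
}

Most invocations stay in the local pass; the cross-node path (with the 
locking it would imply in real life) only triggers when a whole node is 
underloaded.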

> Consider the case where you have a resource group that is allocated 50% 
> of each cpu in a dual cpu system, and only have a single task in that 
> group.  This means that in order to make use of the full group 
> allocation, that task needs to be load-balanced to the other cpu as soon 
> as it gets scheduled out.  Most load-balancers can't handle that kind of 
> granularity, but I have guys in our engineering team who would really 
> like this level of performance.

Divining the intentions of the administrator is an AI-complete problem and we're 
not going to try to solve that in the kernel.  An intelligent administrator 
could also allocate 50% of each CPU to a resource group containing all the 
*other* processes.  Then, when the other processes are scheduled out, your 
single task will run on whichever CPU is idle.  This will very quickly 
equilibrate to the scheduling ping-pong you seem to want.  The scheduler 
deliberately avoids this kind of migration by default because it hurts cache and 
TLB performance, so if you want to override this very sane default behavior, 
you're going to have to explicitly configure it yourself.

> We currently use CKRM on an SMP machine, but the only way we can get 
> away with it is because our main app is affined to one cpu and just 
> about everything else is affined to the other.

If you're not explicitly allocating resources, you're only getting low 
latency, not true realtime.  Realtime requires guaranteed resources, so 
messing with affinities is a necessary evil.
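
For reference, explicit pinning from userspace is just a sched_setaffinity(2) 
call.  A minimal sketch -- the CPU number is obviously a placeholder for 
whatever your partitioning scheme calls for:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	CPU_SET(0, &mask);	/* pin the calling process to CPU 0 */

	if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
		perror("sched_setaffinity");
		return EXIT_FAILURE;
	}

	/* ... run the latency-sensitive work here ... */
	return EXIT_SUCCESS;
}

The same call (or taskset(1), or cpusets) is how you'd express the kind of 
explicit configuration I mentioned above.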

> We have another SMP box that would benefit from group scheduling, but we 
> can't use it because the load balancer is not nearly good enough.

Which scheduler?  Have you tried the CFS group scheduler patches?

	-- Chris
