Date:	Sun, 13 Jul 2008 10:46:59 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Dmitry Adamushko <dmitry.adamushko@...il.com>
cc:	Vegard Nossum <vegard.nossum@...il.com>,
	Paul Menage <menage@...gle.com>,
	Max Krasnyansky <maxk@...lcomm.com>, Paul Jackson <pj@....com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>, miaox@...fujitsu.com,
	rostedt@...dmis.org, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: current linux-2.6.git: cpusets completely broken



On Sun, 13 Jul 2008, Linus Torvalds wrote:
> 
> The thing is, we should fix the top level code to never even _consider_ an 
> invalid CPU as a target, and that in turn should mean that all the other 
> code should be able to just totally ignore CPU hotplug events.

IOW, I think we should totally remove the whole "update_sched_domains()" 
thing too. Any logic that needs it is broken. We shouldn't detach the 
scheduler domains in DOWN_PREPARE (much less UP_PREPARE), we should just 
leave them damn well alone.

As the comment says, "The domains and groups cannot be updated in place 
without racing with the balancing code". The thing is, we shouldn't even 
try. The correct way to handle all this is to make the balancing code use 
the domains regardless, but protect against CPUs going down with 
_another_ data structure that is much easier to update.

Namely something like 'cpu_active_map'.

Then we just get rid of all the crap in update_sched_domains() entirely, 
and then we can make the cpusets code do the *sane* thing, which is to 
rebuild the scheduler domains only when the CPU up/down has completed.
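
A minimal user-space sketch of the 'cpu_active_map' idea, not kernel code: the domain span is left alone during hotplug, and the balancer simply masks it against a separate per-CPU bitmap that is trivial to update. Everything except the name cpu_active_map is an assumption made for illustration (NR_CPUS, set_cpu_active(), cpu_is_active(), pick_target(), and the use of a plain 32-bit mask).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical small system */

/* Stand-in for 'cpu_active_map': one bit per CPU. */
static uint32_t cpu_active_map;

/* Hypothetical helper: flip one CPU's bit when it comes up or goes down. */
static void set_cpu_active(int cpu, bool active)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (active)
		cpu_active_map |= (1u << cpu);
	else
		cpu_active_map &= ~(1u << cpu);
}

/* Hypothetical helper: is this CPU currently a valid target? */
static bool cpu_is_active(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (cpu_active_map >> cpu) & 1u;
}

/*
 * Pick a balancing target from a domain span. The span itself is never
 * modified on hotplug; inactive CPUs are filtered out by the mask.
 * Returns the lowest-numbered active CPU in the span, or -1 if none.
 */
static int pick_target(uint32_t domain_span)
{
	uint32_t candidates = domain_span & cpu_active_map;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (candidates & (1u << cpu))
			return cpu;
	return -1;
}
```

In this model, taking a CPU down is a single bit clear, and the balancing path never sees it as a candidate again, regardless of when the domains are rebuilt.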

So instead of this illogical and crazy mess:

	+       switch (phase) {
	+       case CPU_UP_CANCELED:
	+       case CPU_UP_CANCELED_FROZEN:
	+       case CPU_DOWN_FAILED:
	+       case CPU_DOWN_FAILED_FROZEN:
	+       case CPU_ONLINE:
	+       case CPU_ONLINE_FROZEN:
	+       case CPU_DEAD:
	+       case CPU_DEAD_FROZEN:
	+               common_cpu_mem_hotplug_unplug(1);

it should just say

	+       switch (phase) {
	+       case CPU_ONLINE:
	+       case CPU_ONLINE_FROZEN:
	+       case CPU_DEAD:
	+       case CPU_DEAD_FROZEN:
	+               common_cpu_mem_hotplug_unplug(1);

because it only makes sense to rebuild the scheduler domains when the 
thing SUCCEEDS. 
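
The reduced switch can be modelled in user space as a rough sketch. The enum values, the rebuild_count counter, the stub body of common_cpu_mem_hotplug_unplug(), and the handle_cpu_event() wrapper are all assumptions for illustration; only the case labels and the call come from the fragment above.

```c
#include <assert.h>

/* Illustrative phase values only; the real kernel constants differ. */
enum phase {
	CPU_UP_CANCELED, CPU_UP_CANCELED_FROZEN,
	CPU_DOWN_FAILED, CPU_DOWN_FAILED_FROZEN,
	CPU_ONLINE, CPU_ONLINE_FROZEN,
	CPU_DEAD, CPU_DEAD_FROZEN,
};

/* Counts how many times a domain rebuild was requested. */
static int rebuild_count;

/* Stub: the real function rebuilds the domains; this one only counts. */
static void common_cpu_mem_hotplug_unplug(int rebuild_sd)
{
	assert(rebuild_sd == 0 || rebuild_sd == 1);
	if (rebuild_sd)
		rebuild_count++;
}

/*
 * Hypothetical handler: rebuild only once an up/down has completed.
 * Returns 1 if a rebuild was triggered, 0 if the event was ignored.
 */
static int handle_cpu_event(enum phase phase)
{
	switch (phase) {
	case CPU_ONLINE:
	case CPU_ONLINE_FROZEN:
	case CPU_DEAD:
	case CPU_DEAD_FROZEN:
		common_cpu_mem_hotplug_unplug(1);
		return 1;
	default:
		/* Cancelled or failed transitions changed nothing. */
		return 0;
	}
}
```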

See? By having a sane design, the code is not just more robust and easy to 
follow, you can also simplify it and make it more logical.

The current design is not sane.

			Linus