linux-kernel - Re: [PATCH v2] x86-64, NUMA: fix fakenuma boot failure

Message-ID: <20110414150551.GC21397@mtj.dyndns.org>
Date:	Fri, 15 Apr 2011 00:05:51 +0900
From:	Tejun Heo <tj@...nel.org>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Yinghai Lu <yinghai@...nel.org>,
	Brian Gerst <brgerst@...il.com>,
	Cyrill Gorcunov <gorcunov@...il.com>,
	Shaohui Zheng <shaohui.zheng@...el.com>,
	David Rientjes <rientjes@...gle.com>,
	Ingo Molnar <mingo@...e.hu>,
	"H. Peter Anvin" <hpa@...ux.intel.com>
Subject: Re: [PATCH v2] x86-64, NUMA: fix fakenuma boot failure

Hello,

On Thu, Apr 14, 2011 at 09:51:00AM +0900, KOSAKI Motohiro wrote:
> hmm...  My carbon copy is not corrupted. Maybe crappy intermediate
> server override it ?

Sorry about that.  Problem was on my side.

The patch itself looks good to me now, so,

 Acked-by: Tejun Heo <tj@...nel.org>

but I have some nitpicky comments and it would be nice if you can
respin the patch with the suggested updates.

> Currently, numa=fake boot parameter is broken. If it's used, kernel
> doesn't boot and makes panic by zero divide error.

"kernel may panic due to a divide-by-zero error depending on CPU
configuration"

> The zero divede is caused following line. (ie group->cpu_power==0)
> 
> update_sg_lb_stats()

Maybe it would be a good idea to prefix the above with the filename, ie -
"kernel/sched_fair.c::update_sg_lb_stats()"

> This is regression  since commit e23bba6044 (x86-64, NUMA: Unify
> emulated distance mapping). Because It drop fake_physnodes() and
> then cpu-node mapping was changed.

"This is a regression caused by blah blah because it changes cpu ->
node mapping in the process of dropping fake_physnodes()"

> old) all cpus are assinged node 0
> now) cpus are assigned round robin
>      (the logic is implemented by numa_init_array())

It would be nice to note that the above happens only for CPUs which
lack explicit NUMA configuration information.

> Why round robin assignment doesn't work? Because init_numa_sched_groups_power()
> assume all logical cpus in the same physical cpu are assigned the same node.
  ^^^^^^                                           ^^^^^^^^^^^^
  assumes                                            share

> (Then it only account group_first_cpu()). But the simple round robin
                ^^^^^^^                   ^^^^^
              accounts for      probably ", and" would work better here
> broke the above assumption.
  ^^^^^
  breaks
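
To make the failure mode above concrete, here is a small userspace toy
model, not kernel code. Everything in it is an assumption chosen for
illustration: 8 logical CPUs, 4 fake nodes, 2 sibling threads per
package numbered adjacently, and the helper names (pkg_first_cpu,
assign_round_robin, count_powerless_nodes, fixup_siblings). It only
mimics the idea that group power is accounted through the first CPU of
each package.

```c
#include <assert.h>

#define NR_CPUS          8
#define NR_NODES         4
#define THREADS_PER_PKG  2  /* assumption: siblings are numbered adjacently */

/* First logical CPU of the package that 'cpu' belongs to. */
static int pkg_first_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (cpu / THREADS_PER_PKG) * THREADS_PER_PKG;
}

/* Round-robin placement of CPUs that have no explicit NUMA info. */
static void assign_round_robin(int node_of[NR_CPUS])
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        node_of[cpu] = cpu % NR_NODES;
}

/* Number of CPUs that sit on a different node than their package's first CPU. */
static int count_split_siblings(const int node_of[NR_CPUS])
{
    int split = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (node_of[cpu] != node_of[pkg_first_cpu(cpu)])
            split++;
    return split;
}

/*
 * Toy power accounting: only the first CPU of each package adds power to
 * its node. A node that owns CPUs but ends up with zero power is the case
 * that would later be used as a divisor.
 */
static int count_powerless_nodes(const int node_of[NR_CPUS])
{
    int power[NR_NODES] = {0};
    int has_cpu[NR_NODES] = {0};
    int bad = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        has_cpu[node_of[cpu]] = 1;
        if (cpu == pkg_first_cpu(cpu))
            power[node_of[cpu]] += 1;
    }
    for (int node = 0; node < NR_NODES; node++)
        if (has_cpu[node] && power[node] == 0)
            bad++;
    return bad;
}

/* Sketch of the fix: put every sibling on the node of its package's first CPU. */
static void fixup_siblings(int node_of[NR_CPUS])
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        node_of[cpu] = node_of[pkg_first_cpu(cpu)];
}
```

With these assumed sizes, round robin leaves nodes 1 and 3 holding only
second siblings, so both end up with zero power; after fixup_siblings()
no populated node has zero power.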

> Thus, this patch implement to reassigne node-id if buggy firmware or numa
> emulation makes wrong cpu node map.

It would be nice if you can detail the solution a bit more.  What it's
doing, which configuration it affects and so on.

> +	/*
> +	 * Our CPU scheduler assume all logical cpus in the same physical cpu
> +	 * package are assigned the same node. But, Buggy ACPI table or NUMA
> +	 * emulation might assign them to different node. Fix it.
> +	*/

Care to make the above a docbook comment?
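
For reference, a docbook (kernel-doc) comment opens with "/**", names
the function, and lists each argument with "@". The sketch below only
shows that shape: the function set_sibling_node(), its arguments and
the cpu_node[] array are hypothetical stand-ins, not what the patch
actually contains.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical cpu -> node map, used only for this illustration. */
static int cpu_node[NR_CPUS] = { 0, 1, 0, 1 };

/**
 * set_sibling_node - move a CPU onto the node of its package sibling
 * @cpu: logical CPU whose node assignment is corrected
 * @sibling: another logical CPU in the same physical package
 *
 * The CPU scheduler assumes that all logical CPUs in the same physical
 * package share a node.  A buggy ACPI table or NUMA emulation might
 * assign them to different nodes, so force @cpu onto @sibling's node.
 *
 * Returns the node now shared by both CPUs.
 */
static int set_sibling_node(int cpu, int sibling)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(sibling >= 0 && sibling < NR_CPUS);

    cpu_node[cpu] = cpu_node[sibling];
    return cpu_node[cpu];
}
```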

Thank you.

-- 
tejun