Message-ID: <15891.1414255096@famine>
Date:	Sat, 25 Oct 2014 09:38:16 -0700
From:	Jay Vosburgh <jay.vosburgh@...onical.com>
To:	paulmck@...ux.vnet.ibm.com
cc:	Yanko Kaneti <yaneti@...lera.com>,
	Josh Boyer <jwboyer@...oraproject.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Cong Wang <cwang@...pensource.com>,
	Kevin Fenzi <kevin@...ye.com>, netdev <netdev@...r.kernel.org>,
	"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
	mroos@...ux.ee, tj@...nel.org
Subject: Re: localed stuck in recent 3.18 git in copy_net_ns?

Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

>On Fri, Oct 24, 2014 at 09:33:33PM -0700, Jay Vosburgh wrote:
>> 	Looking at the dmesg, the early boot messages seem to be
>> confused as to how many CPUs there are, e.g.,
>> 
>> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
>> [    0.000000] Hierarchical RCU implementation.
>> [    0.000000]  RCU debugfs-based tracing is enabled.
>> [    0.000000]  RCU dyntick-idle grace-period acceleration is enabled.
>> [    0.000000]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
>> [    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
>> [    0.000000] NR_IRQS:16640 nr_irqs:456 0
>> [    0.000000]  Offload RCU callbacks from all CPUs
>> [    0.000000]  Offload RCU callbacks from CPUs: 0-3.
>> 
>> 	but later shows 2:
>> 
>> [    0.233703] x86: Booting SMP configuration:
>> [    0.236003] .... node  #0, CPUs:      #1
>> [    0.255528] x86: Booted up 1 node, 2 CPUs
>> 
>> 	In any event, the E8400 is a 2 core CPU with no hyperthreading.
>
>Well, this might explain some of the difficulties.  If RCU decides to wait
>on CPUs that don't exist, we will of course get a hang.  And rcu_barrier()
>was definitely expecting four CPUs.
>
>So what happens if you boot with maxcpus=2?  (Or build with
>CONFIG_NR_CPUS=2.) I suspect that this might avoid the hang.  If so,
>I might have some ideas for a real fix.

	Booting with maxcpus=2 makes no difference (the dmesg output is
the same).

	Rebuilding with CONFIG_NR_CPUS=2 makes the problem go away, and
dmesg has different CPU information at boot:

[    0.000000] smpboot: 4 Processors exceeds NR_CPUS limit of 2
[    0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
 [...]
[    0.000000] setup_percpu: NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
 [...]
[    0.000000] Hierarchical RCU implementation.
[    0.000000] 	RCU debugfs-based tracing is enabled.
[    0.000000] 	RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000] NR_IRQS:4352 nr_irqs:440 0
[    0.000000] 	Offload RCU callbacks from all CPUs
[    0.000000] 	Offload RCU callbacks from CPUs: 0-1.
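
	The mismatch discussed above (nr_cpu_ids=4 in early boot versus
2 CPUs actually booted) can be checked mechanically against a saved
dmesg. The following is only a minimal sketch, not part of the original
report: the function names and the SAMPLE text are hypothetical, and the
regular expressions assume the message formats quoted in this thread,
which may differ on other kernel versions or architectures.

```python
import re


def cpu_counts(dmesg_text):
    """Return (nr_cpu_ids, booted_cpus) parsed from dmesg text.

    Either element is None if the corresponding message is absent.
    Assumes the x86 message formats quoted in this thread.
    """
    # Matches both "nr_cpu_ids=4" (RCU message) and "nr_cpu_ids:2" (setup_percpu).
    ids = re.search(r"nr_cpu_ids[=:](\d+)", dmesg_text)
    # Matches "x86: Booted up 1 node, 2 CPUs".
    booted = re.search(r"Booted up \d+ nodes?, (\d+) CPUs", dmesg_text)
    return (
        int(ids.group(1)) if ids else None,
        int(booted.group(1)) if booted else None,
    )


def has_phantom_cpus(dmesg_text):
    """True if the kernel sized itself for more CPUs than it brought up."""
    nr_ids, booted = cpu_counts(dmesg_text)
    return nr_ids is not None and booted is not None and nr_ids > booted


# Hypothetical sample built from two lines quoted earlier in the thread.
SAMPLE = """\
[    0.000000]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
[    0.255528] x86: Booted up 1 node, 2 CPUs
"""

if __name__ == "__main__":
    print(cpu_counts(SAMPLE), has_phantom_cpus(SAMPLE))
```

	A larger nr_cpu_ids than the booted count is not necessarily a
bug in itself (it can also reflect hotplug-capable CPU slots), so the
result is only a hint about where to look.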

	-J

---
	-Jay Vosburgh, jay.vosburgh@...onical.com