Date:	Fri, 24 Oct 2014 15:41:31 -0700
From:	Jay Vosburgh <jay.vosburgh@...onical.com>
To:	paulmck@...ux.vnet.ibm.com
cc:	Yanko Kaneti <yaneti@...lera.com>,
	Josh Boyer <jwboyer@...oraproject.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Cong Wang <cwang@...pensource.com>,
	Kevin Fenzi <kevin@...ye.com>, netdev <netdev@...r.kernel.org>,
	"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
	mroos@...ux.ee, tj@...nel.org
Subject: Re: localed stuck in recent 3.18 git in copy_net_ns?

Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

>On Fri, Oct 24, 2014 at 03:02:04PM -0700, Jay Vosburgh wrote:
>> Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
>> 
[...]
>> 	I've got an ftrace capture from unmodified -net, it looks like
>> this:
>> 
>>     ovs-vswitchd-902   [000] ....   471.778441: rcu_barrier: rcu_sched Begin cpu -1 remaining 0 # 0
>>     ovs-vswitchd-902   [000] ....   471.778452: rcu_barrier: rcu_sched Check cpu -1 remaining 0 # 0
>>     ovs-vswitchd-902   [000] ....   471.778452: rcu_barrier: rcu_sched Inc1 cpu -1 remaining 0 # 1
>>     ovs-vswitchd-902   [000] ....   471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
>>     ovs-vswitchd-902   [000] ....   471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 1 remaining 2 # 1
>>     ovs-vswitchd-902   [000] ....   471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 2 remaining 3 # 1
>>     ovs-vswitchd-902   [000] ....   471.778454: rcu_barrier: rcu_sched OnlineNoCB cpu 3 remaining 4 # 1
>
>OK, so it looks like your system has four CPUs, and rcu_barrier() placed
>callbacks on them all.

	No, the system has only two CPUs.  It's an Intel Core 2 Duo
E8400, and /proc/cpuinfo agrees that there are only 2.  There is a
potentially relevant-sounding message early in dmesg that says:

[    0.000000] smpboot: Allowing 4 CPUs, 2 hotplug CPUs
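
	A quick way to see the same mismatch from userspace is to compare
the kernel's "possible" and "online" CPU masks.  The following is only a
minimal sketch: it assumes the usual sysfs files under
/sys/devices/system/cpu are present, and the helper names
(parse_cpulist, read_cpulist, hotplug_only) are made up for illustration.
On a box like this one the possible list would presumably be 0-3 and the
online list 0-1.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-3" or "0,2-3" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty mask, e.g. a file holding just a newline
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def read_cpulist(name, base="/sys/devices/system/cpu"):
    """Read one mask file ("possible", "online", ...); None if it is missing."""
    path = Path(base) / name
    return parse_cpulist(path.read_text()) if path.exists() else None


def hotplug_only(possible, online):
    """CPUs the kernel allows for but that are not currently online."""
    online_set = set(online)
    return [cpu for cpu in possible if cpu not in online_set]


if __name__ == "__main__":
    possible = read_cpulist("possible")
    online = read_cpulist("online")
    if possible is None or online is None:
        print("sysfs CPU masks not available on this system")
    else:
        print("possible:", possible)
        print("online:  ", online)
        print("possible but not online:", hotplug_only(possible, online))
```

	If the last list is non-empty, the kernel is accounting for CPUs
that are not actually running, which is the situation the smpboot
message describes.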

>>     ovs-vswitchd-902   [000] ....   471.778454: rcu_barrier: rcu_sched Inc2 cpu -1 remaining 4 # 2
>
>The above removes the extra count used to avoid races between posting new
>callbacks and completion of previously posted callbacks.
>
>>          rcuos/0-9     [000] ..s.   471.793150: rcu_barrier: rcu_sched CB cpu -1 remaining 3 # 2
>>          rcuos/1-18    [001] ..s.   471.793308: rcu_barrier: rcu_sched CB cpu -1 remaining 2 # 2
>
>Two of the four callbacks fired, but the other two appear to be AWOL.
>And rcu_barrier() won't return until they all fire.
>
>> 	I let it sit through several "hung task" cycles but that was all
>> there was for rcu:rcu_barrier.
>> 
>> 	I should have ftrace with the patch as soon as the kernel is
>> done building, then I can try the below patch (I'll start it building
>> now).
>
>Sounds very good, looking forward to hearing of the results.

	Going to bounce it for ftrace now, but the cpu count mismatch
seemed important enough to mention separately.
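
	For anyone wanting to tally a capture like the one quoted above, a
rough sketch follows.  It is not part of any kernel tooling; the regular
expression and the summarize() helper are assumptions based only on the
line format visible in this trace, and only the event names that appear
here (OnlineNoCB and CB) are counted.

```python
import re

# The rcu:rcu_barrier lines quoted earlier in this message.
TRACE = """\
ovs-vswitchd-902   [000] ....   471.778441: rcu_barrier: rcu_sched Begin cpu -1 remaining 0 # 0
ovs-vswitchd-902   [000] ....   471.778452: rcu_barrier: rcu_sched Check cpu -1 remaining 0 # 0
ovs-vswitchd-902   [000] ....   471.778452: rcu_barrier: rcu_sched Inc1 cpu -1 remaining 0 # 1
ovs-vswitchd-902   [000] ....   471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
ovs-vswitchd-902   [000] ....   471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 1 remaining 2 # 1
ovs-vswitchd-902   [000] ....   471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 2 remaining 3 # 1
ovs-vswitchd-902   [000] ....   471.778454: rcu_barrier: rcu_sched OnlineNoCB cpu 3 remaining 4 # 1
ovs-vswitchd-902   [000] ....   471.778454: rcu_barrier: rcu_sched Inc2 cpu -1 remaining 4 # 2
rcuos/0-9     [000] ..s.   471.793150: rcu_barrier: rcu_sched CB cpu -1 remaining 3 # 2
rcuos/1-18    [001] ..s.   471.793308: rcu_barrier: rcu_sched CB cpu -1 remaining 2 # 2
"""

# Assumed layout: "rcu_barrier: <flavor> <event> cpu <n> remaining <n> # <n>"
LINE_RE = re.compile(
    r"rcu_barrier: (\S+) (\S+) cpu (-?\d+) remaining (\d+) # (\d+)"
)


def summarize(trace):
    """Return (cpus with a posted callback, callbacks fired, last 'remaining' value)."""
    posted = []
    fired = 0
    remaining = None
    for line in trace.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an rcu_barrier trace line
        _flavor, event, cpu, rem, _seq = match.groups()
        if event == "OnlineNoCB":
            posted.append(int(cpu))  # a callback was posted for this CPU
        elif event == "CB":
            fired += 1               # one posted callback has run
        remaining = int(rem)
    return posted, fired, remaining


if __name__ == "__main__":
    posted, fired, remaining = summarize(TRACE)
    print("callbacks posted on CPUs:", posted)
    print("callbacks fired:", fired)
    print("still outstanding:", len(posted) - fired, "(trace says", remaining, ")")
```

	Run against the capture above, this reports callbacks posted on
CPUs 0 through 3 but only two CB events, leaving two outstanding, which
lines up with the two CPUs that do not actually exist on this machine.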

	-J

---
	-Jay Vosburgh, jay.vosburgh@...onical.com