Message-ID: <8988.1414190491@famine>
Date: Fri, 24 Oct 2014 15:41:31 -0700
From: Jay Vosburgh <jay.vosburgh@...onical.com>
To: paulmck@...ux.vnet.ibm.com
cc: Yanko Kaneti <yaneti@...lera.com>,
Josh Boyer <jwboyer@...oraproject.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Cong Wang <cwang@...pensource.com>,
Kevin Fenzi <kevin@...ye.com>, netdev <netdev@...r.kernel.org>,
"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
mroos@...ux.ee, tj@...nel.org
Subject: Re: localed stuck in recent 3.18 git in copy_net_ns?
Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
>On Fri, Oct 24, 2014 at 03:02:04PM -0700, Jay Vosburgh wrote:
>> Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
>>
[...]
>> I've got an ftrace capture from unmodified -net, it looks like
>> this:
>>
>> ovs-vswitchd-902 [000] .... 471.778441: rcu_barrier: rcu_sched Begin cpu -1 remaining 0 # 0
>> ovs-vswitchd-902 [000] .... 471.778452: rcu_barrier: rcu_sched Check cpu -1 remaining 0 # 0
>> ovs-vswitchd-902 [000] .... 471.778452: rcu_barrier: rcu_sched Inc1 cpu -1 remaining 0 # 1
>> ovs-vswitchd-902 [000] .... 471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
>> ovs-vswitchd-902 [000] .... 471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 1 remaining 2 # 1
>> ovs-vswitchd-902 [000] .... 471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 2 remaining 3 # 1
>> ovs-vswitchd-902 [000] .... 471.778454: rcu_barrier: rcu_sched OnlineNoCB cpu 3 remaining 4 # 1
>
>OK, so it looks like your system has four CPUs, and rcu_barrier() placed
>callbacks on them all.

No, the system has only two CPUs. It's an Intel Core 2 Duo
E8400, and /proc/cpuinfo agrees that there are only 2. There is a
potentially relevant-sounding message early in dmesg that says:

[ 0.000000] smpboot: Allowing 4 CPUs, 2 hotplug CPUs

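One way to cross-check that message against the running system is to compare the kernel's "possible" and "online" CPU masks in sysfs. The following is only a rough sketch: the sysfs files /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online are standard, but the helper names are made up for illustration, and the values shown in the comments are what the dmesg line would suggest for this machine, not something that was actually read from it.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-1,3" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file or a stray comma yields no CPUs
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def read_cpu_set(name, base="/sys/devices/system/cpu"):
    """Read one sysfs CPU mask file: "possible", "present" or "online"."""
    return parse_cpu_list(Path(base, name).read_text())


def possible_but_offline(possible, online):
    """CPUs the kernel reserved room for that are not currently online."""
    return sorted(possible - online)


# Assumed values for a box that reports "Allowing 4 CPUs, 2 hotplug CPUs"
# (an assumption for illustration, not read from the machine in question):
#   possible_but_offline(parse_cpu_list("0-3"), parse_cpu_list("0-1"))
```

On a live system, calling possible_but_offline(read_cpu_set("possible"), read_cpu_set("online")) would list any CPU numbers that exist only as hotplug slots.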
>> ovs-vswitchd-902 [000] .... 471.778454: rcu_barrier: rcu_sched Inc2 cpu -1 remaining 4 # 2
>
>The above removes the extra count used to avoid races between posting new
>callbacks and completion of previously posted callbacks.
>
>> rcuos/0-9 [000] ..s. 471.793150: rcu_barrier: rcu_sched CB cpu -1 remaining 3 # 2
>> rcuos/1-18 [001] ..s. 471.793308: rcu_barrier: rcu_sched CB cpu -1 remaining 2 # 2
>
>Two of the four callbacks fired, but the other two appear to be AWOL.
>And rcu_barrier() won't return until they all fire.
>
>> I let it sit through several "hung task" cycles but that was all
>> there was for rcu:rcu_barrier.
>>
>> I should have ftrace with the patch as soon as the kernel is
>> done building, then I can try the below patch (I'll start it building
>> now).
>
>Sounds very good, looking forward to hearing of the results.
I'm going to reboot it for the ftrace run now, but the CPU count
mismatch seemed important enough to mention separately.
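For anyone wanting to tally the rcu_barrier trace events quoted above without reading them by eye, here is a small sketch that replays them. The trace lines are copied from the capture; the regular expression and the summarize() helper are hypothetical, written only for this illustration, and the trailing "# N" field is parsed but not interpreted.

```python
import re

# rcu:rcu_barrier events copied from the capture quoted in this thread.
TRACE = """\
ovs-vswitchd-902 [000] .... 471.778441: rcu_barrier: rcu_sched Begin cpu -1 remaining 0 # 0
ovs-vswitchd-902 [000] .... 471.778452: rcu_barrier: rcu_sched Check cpu -1 remaining 0 # 0
ovs-vswitchd-902 [000] .... 471.778452: rcu_barrier: rcu_sched Inc1 cpu -1 remaining 0 # 1
ovs-vswitchd-902 [000] .... 471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
ovs-vswitchd-902 [000] .... 471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 1 remaining 2 # 1
ovs-vswitchd-902 [000] .... 471.778453: rcu_barrier: rcu_sched OnlineNoCB cpu 2 remaining 3 # 1
ovs-vswitchd-902 [000] .... 471.778454: rcu_barrier: rcu_sched OnlineNoCB cpu 3 remaining 4 # 1
ovs-vswitchd-902 [000] .... 471.778454: rcu_barrier: rcu_sched Inc2 cpu -1 remaining 4 # 2
rcuos/0-9 [000] ..s. 471.793150: rcu_barrier: rcu_sched CB cpu -1 remaining 3 # 2
rcuos/1-18 [001] ..s. 471.793308: rcu_barrier: rcu_sched CB cpu -1 remaining 2 # 2
"""

# flavor, event name, cpu, "remaining" count, trailing counter
EVENT_RE = re.compile(
    r"rcu_barrier: (\S+) (\S+) cpu (-?\d+) remaining (\d+) # (\d+)"
)


def summarize(trace):
    """Tally which CPUs had a barrier callback posted and how many fired."""
    queued = set()    # CPUs named in OnlineNoCB events
    fired = 0         # number of CB events seen
    remaining = None  # "remaining" value from the last event seen
    for line in trace.splitlines():
        match = EVENT_RE.search(line)
        if not match:
            continue
        _flavor, event, cpu, rem, _counter = match.groups()
        remaining = int(rem)
        if event == "OnlineNoCB":
            queued.add(int(cpu))
        elif event == "CB":
            fired += 1
    return {"queued_cpus": sorted(queued), "fired": fired, "remaining": remaining}


print(summarize(TRACE))
# {'queued_cpus': [0, 1, 2, 3], 'fired': 2, 'remaining': 2}
```

The tally shows callbacks posted for CPUs 0 through 3 but only two CB events, leaving the count at 2, which matches the two-CPU hardware and the two callbacks that never fire.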
-J
---
-Jay Vosburgh, jay.vosburgh@...onical.com