Message-ID: <6050.1414176588@famine>
Date:	Fri, 24 Oct 2014 11:49:48 -0700
From:	Jay Vosburgh <jay.vosburgh@...onical.com>
To:	paulmck@...ux.vnet.ibm.com
cc:	Yanko Kaneti <yaneti@...lera.com>,
	Josh Boyer <jwboyer@...oraproject.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Cong Wang <cwang@...pensource.com>,
	Kevin Fenzi <kevin@...ye.com>, netdev <netdev@...r.kernel.org>,
	"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
	mroos@...ux.ee
Subject: Re: localed stuck in recent 3.18 git in copy_net_ns?

Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

>On Fri, Oct 24, 2014 at 08:35:26PM +0300, Yanko Kaneti wrote:
>> On Fri-10/24/14-2014 10:20, Paul E. McKenney wrote:
>> > On Fri, Oct 24, 2014 at 08:09:31PM +0300, Yanko Kaneti wrote:
>> > > On Fri-10/24/14-2014 09:54, Paul E. McKenney wrote:
>> > > > On Fri, Oct 24, 2014 at 07:29:43PM +0300, Yanko Kaneti wrote:
>> > > > > On Fri-10/24/14-2014 08:40, Paul E. McKenney wrote:
>> > > > > > On Fri, Oct 24, 2014 at 12:08:57PM +0300, Yanko Kaneti wrote:
>> > > > > > > On Thu-10/23/14-2014 15:04, Paul E. McKenney wrote:
>> > > > > > > > On Fri, Oct 24, 2014 at 12:45:40AM +0300, Yanko Kaneti wrote:
>> > > > > > > > > 
>> > > > > > > > > On Thu, 2014-10-23 at 13:05 -0700, Paul E. McKenney wrote:
>> > > > > > > > > > On Thu, Oct 23, 2014 at 10:51:59PM +0300, Yanko Kaneti wrote:
>> > > > 
>> > > > [ . . . ]
>> > > > 
>> > > > > > > Ok, unless I've messed up something major, bisecting points to:
>> > > > > > > 
>> > > > > > > 35ce7f29a44a rcu: Create rcuo kthreads only for onlined CPUs
>> > > > > > > 
>> > > > > > > Makes any sense ?
>> > > > > > 
>> > > > > > Good question.  ;-)
>> > > > > > 
>> > > > > > Are any of your online CPUs missing rcuo kthreads?  There should be
>> > > > > > kthreads named rcuos/0, rcuos/1, rcuos/2, and so on for each online CPU.
>> > > > > 
>> > > > > It's a Phenom II X6. With 3.17, and with linux-tip with 35ce7f29a44a
>> > > > > reverted, there are 8 rcuos kthreads and the modprobe ppp_generic
>> > > > > testcase reliably works; libvirt also manages to set up its bridge.
>> > > > > 
>> > > > > With plain linux-tip there are 6 rcuos kthreads, but the failure is as
>> > > > > reliable as before.
>> > > 
>> > > > Thank you, very interesting.  Which 6 of the rcuos are present?
>> > > 
>> > > Well, the rcuos are 0 to 5. Which sounds right for a 6 core CPU like this   
>> > > Phenom II.
>> > 
>> > Ah, you get 8 without the patch because it creates them for potential
>> > CPUs as well as real ones.  OK, got it.
>> > 
>> > > > > Awaiting instructions. :)
>> > > > 
>> > > > Well, I thought I understood the problem until you found that only 6 of
>> > > > the expected 8 rcuos are present with linux-tip without the revert.  ;-)
>> > > > 
>> > > > I am putting together a patch for the part of the problem that I think
>> > > > I understand, of course, but it would help a lot to know which two of
>> > > > the rcuos are missing.  ;-)
>> > > 
>> > > Ready to test
>> > 
>> > Well, if you are feeling aggressive, give the following patch a spin.
>> > I am doing sanity tests on it in the meantime.
>> 
>> Doesn't seem to make a difference here
>
>OK, inspection isn't cutting it, so time for tracing.  Does the system
>respond to user input?  If so, please enable rcu:rcu_barrier ftrace before
>the problem occurs, then dump the trace buffer after the problem occurs.

	My system is up and responsive when the problem occurs, so this
shouldn't be a problem.

	Do you want the ftrace with your patch below, or unmodified tip
of tree?
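
	For reference, a minimal sketch of enabling that event from user
space. It assumes tracefs is reachable at /sys/kernel/debug/tracing (some
systems mount it at /sys/kernel/tracing instead) and that the running
kernel exposes the rcu:rcu_barrier event. The helper names trace_write()
and enable_rcu_barrier_event() are hypothetical, made up for illustration;
they are not part of any kernel or library API.

```c
#include <stdio.h>

/* Assumed tracefs location; may be /sys/kernel/tracing on some systems. */
#define TRACING_ROOT "/sys/kernel/debug/tracing"

/*
 * Hypothetical helper: write the string val to the file root/rel.
 * Returns 0 on success, -1 on any failure.
 */
int trace_write(const char *root, const char *rel, const char *val)
{
	char path[512];
	FILE *f;
	int n = snprintf(path, sizeof(path), "%s/%s", root, rel);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* path did not fit in the buffer */
	f = fopen(path, "w");
	if (!f)
		return -1;	/* e.g. not root, or event not present */
	if (fputs(val, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: turn on the rcu:rcu_barrier trace event. */
int enable_rcu_barrier_event(const char *root)
{
	return trace_write(root, "events/rcu/rcu_barrier/enable", "1");
}
```

	Calling enable_rcu_barrier_event(TRACING_ROOT) as root before
reproducing the hang would switch the event on; the buffer can then be
read from the "trace" file under the same directory afterwards.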

	-J


>							Thanx, Paul
>
>> > ------------------------------------------------------------------------
>> > 
>> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>> > index 29fb23f33c18..927c17b081c7 100644
>> > --- a/kernel/rcu/tree_plugin.h
>> > +++ b/kernel/rcu/tree_plugin.h
>> > @@ -2546,9 +2546,13 @@ static void rcu_spawn_one_nocb_kthread(struct rcu_state *rsp, int cpu)
>> >  			rdp->nocb_leader = rdp_spawn;
>> >  			if (rdp_last && rdp != rdp_spawn)
>> >  				rdp_last->nocb_next_follower = rdp;
>> > -			rdp_last = rdp;
>> > -			rdp = rdp->nocb_next_follower;
>> > -			rdp_last->nocb_next_follower = NULL;
>> > +			if (rdp == rdp_spawn) {
>> > +				rdp = rdp->nocb_next_follower;
>> > +			} else {
>> > +				rdp_last = rdp;
>> > +				rdp = rdp->nocb_next_follower;
>> > +				rdp_last->nocb_next_follower = NULL;
>> > +			}
>> >  		} while (rdp);
>> >  		rdp_spawn->nocb_next_follower = rdp_old_leader;
>> >  	}
>> > 

---
	-Jay Vosburgh, jay.vosburgh@...onical.com