Message-ID: <20110824224454.GB26417@somewhere.redhat.com>
Date: Thu, 25 Aug 2011 00:45:00 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Josh Boyer <jwboyer@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: 3.0-git15 Atomic scheduling in pidmap_init
On Thu, Aug 18, 2011 at 02:55:40PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 18, 2011 at 02:23:34PM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 18, 2011 at 02:00:34PM -0700, Andrew Morton wrote:
> > > On Thu, 18 Aug 2011 11:35:23 -0700
> > > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> > >
> > > > On Wed, Aug 17, 2011 at 07:17:50PM -0400, Josh Boyer wrote:
> > > > > On Thu, Aug 18, 2011 at 01:06:44AM +0200, Frederic Weisbecker wrote:
> > > > > > On Wed, Aug 17, 2011 at 07:02:19PM -0400, Josh Boyer wrote:
> > > > > > > On Wed, Aug 17, 2011 at 03:49:16PM -0700, Paul E. McKenney wrote:
> > > > > > > > On Wed, Aug 17, 2011 at 06:37:35PM -0400, Josh Boyer wrote:
> > > > > > > > > On Mon, Aug 15, 2011 at 08:20:52AM -0700, Paul E. McKenney wrote:
> > > > > > > > > > On Mon, Aug 15, 2011 at 10:04:17AM -0400, Josh Boyer wrote:
> > > > > > > > > > > > Please see the attached.
> > > > > > > > > > >
> > > > > > > > > > > Fixed it up quickly to apply on top of -rc2 and it seems to solve the
> > > > > > > > > > > problem nicely. Thanks for the patch.
> > > > > > > > > >
> > > > > > > > > > Good to hear! I guess I should keep it, then. ;-)
> > > > > > > > >
> > > > > > > > > Hey Paul, were you going to send this to Linus for -rc3? I haven't seen
> > > > > > > > > it come across LKML yet.
> > > > > > > >
> > > > > > > > I might... But does it qualify as a regression? That part of the
> > > > > > > > code hasn't changed for some time now.
> > > > > > >
> > > > > > > It's a fix for a problem that is newly surfaced in 3.1. A regression,
> > > > > > > likely not since it's been there forever, but new debugging options
> > > > > > > uncovered it. I'm pretty sure the -rc stage takes fixes even if they
> > > > > > > aren't regressions.
> > > > > >
> > > > > > Nope, after -rc1 only regressions fixes are taken (most of the time).
> > > > >
> > > > > Sigh.
> > > > >
> > > > > Look, either way I'm carrying this patch in Fedora because it fixes
> > > > > a bug that is actually being reported by users (and by abrtd as well).
> > > > > If you both want to wait until 3.2 to actually submit it to Linus,
> > > > > then OK.
> > > > >
> > > > > Honestly, I'm just glad we actually run with the debug options enabled
> > > > > (which seems to be a rare thing) so bugs like this are actually found.
> > > > > Thanks for the fix.
> > > >
> > > > I am sorry, but I didn't make the rules! And I must carry the fix
> > > > longer as well, if that makes you feel any better.
> > >
> > > bah, we're not that anal. The patch fixes a bug and prevents a nasty
> > > warning spew. Please, send it to Linus.
> >
> > Given your Acked-by and Josh's Tested-by I might consider it. ;-)
> >
> > Speaking of which, Josh, does this patch help Nicolas and Michal?
> >
> > > We appear to be referring to the patch "rcu: Avoid having just-onlined
> > > CPU resched itself when RCU is idle"? If so, the changelog doesn't
> > > even mention that the patch fixes a scheduling-while-atomic warning and
> > > the changelog fails to refer to the redhat bug report. These omissions
> > > should be repaired, please.
> >
> > OK... But I cannot bring myself to believe that my fix does more than
> > hide some other bug. Which is OK, I will just say that in the changelog.
>
> And here is this patch ported to v3.1-rc2, FYI.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Avoid having just-onlined CPU resched itself when RCU is idle
>
> CPUs set rdp->qs_pending when coming online to resolve races with
> grace-period start. However, this means that if RCU is idle, the
> just-onlined CPU might needlessly send itself resched IPIs. Adjust the
> online-CPU initialization to avoid this, and also to correctly cause
> the CPU to respond to the current grace period if needed.
>
> This patch is believed to fix or otherwise suppress problems in
> https://bugzilla.redhat.com/show_bug.cgi?id=726877, however, the
> relationship is not apparent to this patch's author.
>
> Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index ba06207..6986d34 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1865,8 +1865,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
>
> /* Set up local state, ensuring consistent view of global state. */
> raw_spin_lock_irqsave(&rnp->lock, flags);
> - rdp->passed_quiesc = 0; /* We could be racing with new GP, */
> - rdp->qs_pending = 1; /* so set up to respond to current GP. */
> rdp->beenonline = 1; /* We have now been online. */
> rdp->preemptible = preemptible;
> rdp->qlen_last_fqs_check = 0;
> @@ -1891,8 +1889,15 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
> rnp->qsmaskinit |= mask;
> mask = rnp->grpmask;
> if (rnp == rdp->mynode) {
> - rdp->gpnum = rnp->completed; /* if GP in progress... */
> + /*
> + * If there is a grace period in progress, we will
> + * set up to wait for it next time we run the
> + * RCU core code.
> + */
> + rdp->gpnum = rnp->completed;
> rdp->completed = rnp->completed;
> + rdp->passed_quiesc = 0;
> + rdp->qs_pending = 1;
In the previous version you had rdp->qs_pending = 0 here.
With it set to 0, I can understand how the patch fixes the problem.
With it set to 1, I don't see how it fixes anything.
Should it perhaps be set to 1 only if rnp->gpnum > rnp->completed?
> rdp->passed_quiesc_completed = rnp->completed - 1;
> }
> raw_spin_unlock(&rnp->lock); /* irqs already disabled. */
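To make the question above concrete, here is a minimal standalone sketch of the conditional variant, i.e. setting qs_pending only when the node shows a grace period in progress. It is not the kernel code: the struct names, the helper name online_cpu_gp_setup() and the reduced field set are assumptions made for illustration, and the plain ">" comparison ignores counter wraparound.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel's rcu_node and rcu_data.
 * Hypothetical types: only the fields discussed in this thread are kept.
 */
struct rcu_node_sketch {
	unsigned long gpnum;      /* most recently started grace period */
	unsigned long completed;  /* most recently completed grace period */
};

struct rcu_data_sketch {
	unsigned long gpnum;
	unsigned long completed;
	unsigned long passed_quiesc_completed;
	int passed_quiesc;
	int qs_pending;
};

/*
 * Hypothetical helper mirroring the "rnp == rdp->mynode" branch of the
 * patch, with the one change suggested in the reply: qs_pending is set
 * only if a grace period is in progress on the node.
 */
static void online_cpu_gp_setup(struct rcu_data_sketch *rdp,
				const struct rcu_node_sketch *rnp)
{
	assert(rdp != NULL && rnp != NULL);

	rdp->gpnum = rnp->completed;
	rdp->completed = rnp->completed;
	rdp->passed_quiesc = 0;
	/* Assumption: gpnum > completed means a grace period is in progress. */
	rdp->qs_pending = (rnp->gpnum > rnp->completed) ? 1 : 0;
	rdp->passed_quiesc_completed = rnp->completed - 1;
}
```

With this variant, a CPU coming online while RCU is idle (gpnum == completed) starts with qs_pending == 0, so it has no pending quiescent state to report for a grace period that does not exist. Whether that is the right fix for the reported warning is exactly the open question in this thread.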