Message-ID: <20090709104412.GA3651@ami.dom.local>
Date: Thu, 9 Jul 2009 12:44:12 +0200
From: Jarek Poplawski <jarkao2@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Andres Freund <andres@...razel.de>,
Joao Correia <joaomiguelcorreia@...il.com>,
Arun R Bharadwaj <arun@...ux.vnet.ibm.com>,
Stephen Hemminger <shemminger@...tta.com>,
netdev@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Patrick McHardy <kaber@...sh.net>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly caused by netem)
On Thu, Jul 09, 2009 at 12:31:53PM +0200, Thomas Gleixner wrote:
> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > On Thu, Jul 09, 2009 at 12:23:17AM +0200, Andres Freund wrote:
> > ...
> > > Unfortunately this just yields the same backtraces during softlockup and not
> > > earlier.
> > > I did not test without lockdep yet, but that should not have stopped the BUG
> > > from appearing, right?
> >
> > Since it looks like hrtimers now, these changes in timers shouldn't
> > matter. Let's wait for new ideas.
>
> Some background:
...
> There is another oddity in cbq_undelay() which is the hrtimer callback
> function:
>
> if (delay) {
> ktime_t time;
>
> time = ktime_set(0, 0);
> time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
> hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
>
> The canonical way to restart a hrtimer from the callback function is to
> set the expiry value and return HRTIMER_RESTART.
OK, that's for later because we didn't use cbq here.
>
> }
>
> sch->flags &= ~TCQ_F_THROTTLED;
> __netif_schedule(qdisc_root(sch));
> return HRTIMER_NORESTART;
>
> Again, this should not cause the timer to be enqueued on another CPU
> as we do not enqueue on a different CPU when the callback is running,
> but see above ...
>
> I have the feeling that the code relies on some implicit cpu
> boundness, which is no longer guaranteed with the timer migration
> changes, but that's a question for the network experts.
As a matter of fact, I've just looked at this __netif_schedule(),
which really is cpu bound, so you might be 100% right.
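For reference, the restart pattern described above (set the expiry in
the callback and return HRTIMER_RESTART instead of calling
hrtimer_start()) can be sketched roughly as below. This is a userspace
mock, not kernel code: struct mock_hrtimer, mock_undelay(), mock_run()
and the two globals are hypothetical stand-ins that only imitate the
control flow, and the numbers are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same names as the kernel's return codes, redefined here for the mock. */
enum hrtimer_restart { HRTIMER_NORESTART, HRTIMER_RESTART };

/* Hypothetical stand-in for struct hrtimer: an expiry and a callback. */
struct mock_hrtimer {
	int64_t expires_ns;
	enum hrtimer_restart (*function)(struct mock_hrtimer *);
};

/* Hypothetical stand-ins for the qdisc state and the clock. */
int64_t pending_delay_ns = 500;
int64_t now_ns;

/*
 * Callback in the "canonical" style: when more delay is needed, only
 * update the expiry and ask the timer core to re-enqueue the timer.
 * In the kernel the assignment would be a helper such as
 * hrtimer_set_expires(); no hrtimer_start() call from the callback.
 */
enum hrtimer_restart mock_undelay(struct mock_hrtimer *t)
{
	int64_t delay = pending_delay_ns;

	pending_delay_ns = 0;
	if (delay) {
		t->expires_ns = now_ns + delay;
		return HRTIMER_RESTART;
	}
	return HRTIMER_NORESTART;
}

/*
 * Mock timer core: advance the clock to the expiry, run the callback,
 * and repeat for as long as it asks for a restart.
 * Returns how many times the callback fired.
 */
int mock_run(struct mock_hrtimer *t)
{
	int fired = 0;
	enum hrtimer_restart ret;

	assert(t->function != NULL);
	do {
		now_ns = t->expires_ns;
		ret = t->function(t);
		fired++;
	} while (ret == HRTIMER_RESTART);

	return fired;
}
```

With a timer initially set to expire at 1000 ns, the callback fires
once, pushes the expiry to 1500 ns, fires again and then stops. The
point is only that the re-arming decision stays with the timer core,
which is already running the callback.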
Thanks for your help,
Jarek P.