[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Mon, 5 May 2008 07:50:40 +0000
From: Jarek Poplawski <jarkao2@...il.com>
To: Jay Cliburn <jacliburn@...lsouth.net>
Cc: linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
Chris Snook <csnook@...hat.com>
Subject: Re: Need help debugging memory corruption
On Mon, May 05, 2008 at 07:27:34AM +0000, Jarek Poplawski wrote:
> On Sun, May 04, 2008 at 02:55:29PM -0500, Jay Cliburn wrote:
> > On Sun, 04 May 2008 16:55:08 +0200
> > Jarek Poplawski <jarkao2@...il.com> wrote:
> >
> > > Jarek Poplawski wrote, On 05/04/2008 04:24 PM:
> > > ...
> > >
> > > > I'm definitely with less experience, so I wonder why it can't be
> > > > a simple race between atl1_clean_rx_ring() and something (maybe even
> > > > pending atl1_intr_rx()) on the other cpu writing skb while kfreeing?
> > >
> > >
> > > Hmm... atl1_intr_rx() looks impossible, so atl1_alloc_rx_buffers()?
> >
> > I booted with nosmp and the bug is *much* harder to hit, but I still
> > hit it once out of about 10 tries. Does the fact that I hit it once
> > using nosmp disprove the race theory?
>
> Probably not: I don't know how about preemption model, but especially
> some maybe unkilled timers/watchdogs or workqueues could be considered.
> Of course this idea looks very unprobable (should happen with less
> than 4GB too), but should be quite easy to verify by adding some
> temporary spinlocks around these rx ring operations?
Hmm#2... Of course for testing without smp disabling local irqs should
be enough. BTW... since this check is done during __free_slab it seems
such a corruption could take place before this closing, so maybe this
atl1_intr_rx() should be considered in these races yet...
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists