Message-ID: <20130916065047.GH27487@1wt.eu>
Date: Mon, 16 Sep 2013 08:50:47 +0200
From: Willy Tarreau <w@....eu>
To: Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>
Cc: Ethan Tuttle <ethan@...antuttle.com>, Andrew Lunn <andrew@...n.ch>,
Jason Cooper <jason@...edaemon.net>, netdev@...r.kernel.org,
Ezequiel Garcia <ezequiel.garcia@...e-electrons.com>,
Gregory Clément <gregory.clement@...e-electrons.com>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: mvneta: oops in __rcu_read_lock on mirabox

Hi Thomas,

On Sun, Sep 15, 2013 at 08:57:01PM +0200, Thomas Petazzoni wrote:
> Hello Ethan,
>
> On Sat, 14 Sep 2013 18:05:32 -0700, Ethan Tuttle wrote:
> > When I upgraded my mirabox from 3.11-rc4 to 3.11, I started seeing
> > oopses while receiving network traffic (see below). Sending a flood
> > ping will trigger the oops within a few minutes.
> >
> > The stack looks similar, but not identical to, the one reported
> > earlier by Jochen De Smet[1]. In my case the PC is always
> > __rcu_read_lock.
> >
> > A git bisect found a878764 "Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net" to be the
> > first bad commit... interesting, because neither of the merge parents
> > produce the oops. I rebased the net changes onto the other merge
> > parent and bisected that series, which identified 702821f "net: revert
> > 8728c544a9c ("net: dev_pick_tx() fix")" as the first bad commit.
> > Indeed, reverting 702821f from 3.11 produces a kernel which stands up
> > to a ping flood for hours.
> >
> > Each of the times I reproduced this, it was identified as "Unhandled
> > prefetch abort: unknown 25 (0x409) at 0xc0036ea0", except once when I
> > got "unknown 16 (0x400)".
> >
> > I'm assuming this is an mvneta bug that was exposed by 702821f.
> > That's just a guess, and I don't have the skills to debug this any
> > further. In any case, I figured the maintainers would want to know
> > about it.
>
> Thanks a lot for the report and the detailed investigation.
> Unfortunately, I don't have Armada 370 hardware with me this week, so
> I'm unable to test and reproduce the issue.
>
> However, I've added a bunch of Armada 370 people/maintainers in Cc,
> hopefully they can at least try to reproduce and confirm that reverting
> this patch makes the problem go away, which would confirm that we
> should look for a bug in the mvneta driver around this problem.

I'm currently testing on 3.11.1 (which I had here) and have not seen
any issue after 50M packets. My kernel is running in Thumb mode and
without SMP.

Ethan, I guess we'll need your config.

Thanks,
Willy