Message-ID: <1338006309.10135.15.camel@edumazet-glaptop>
Date: Sat, 26 May 2012 06:25:09 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Sam Portolla <samportolla@...oo.com>
Cc: Hugh Dickins <hughd@...gle.com>,
"kaber@...sh.net" <kaber@...sh.net>,
"jarkao2@...il.com" <jarkao2@...il.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: exit_mmap BUG_ON in 2.6.23 (and Add qdisc __NET_XMIT_STOLEN)
On Fri, 2012-05-25 at 17:28 -0700, Sam Portolla wrote:
Please don't top-post on this list.
>
> [please cc samPortolla@...oo.com on the replies; not a member of this
> mailer]
>
> Hi Hugh,
>
> Thank you! It turns out our 2.6.23 kernel does not have this old
> patch. I am also adding Jarek, David and Patrick, who were involved in
> the fix below, for their insights:
>
>
> commit 378a2f090f7a478704a372a4869b8a9ac206234e
> Date: Mon Aug 4 22:31:03 2008 -0700
> net_sched: Add qdisc __NET_XMIT_STOLEN flag
> In this failure case below, as well as some others, the ethernet
> driver printed a transmit timeout just before the crash.
>
> It seems since we don't have the above patch, the kernel qdisc Tx
> packet path for fragmented packets can be messed up and corrupt the
> skb it passes to drivers, which in the historic case that led to
> above fix, caused an skb NULL ptr de-ref in the driver itself (which
> we also saw once).
>
> Jarek, David or Patrick,
>
> Could the lack of above patch cause the kernel to also falsely detect
> transmit timeouts on various drivers as it can not properly keep track
> of packets transmitted? Can you please elaborate so a newbie like me
> can understand?
>
> Is the above commit the sole one required for the kernel panic/skb
> NULL de-ref driver issue, or are there more fixes needed later on that
> can be backported to an older kernel (2.6.23 GNU/Linux x86_64)?
>
Transmit timeouts are caused by races in some network drivers.
The device stays in the XOFF state for too long (forever, in fact,
once the race has triggered).
Since 2.6.23 we have fixed a lot of them, but some races still exist.