Message-ID: <4EC6A38E.6060404@iki.fi>
Date: Fri, 18 Nov 2011 20:27:26 +0200
From: Timo Teräs <timo.teras@....fi>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: Nick Bowler <nbowler@...iptictech.com>, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>
Subject: Re: Occasional oops with IPSec and IPv6.

On 11/18/2011 06:39 PM, Eric Dumazet wrote:
> On Friday, 18 November 2011 at 11:27 -0500, Nick Bowler wrote:
>> On 2011-11-17 14:09 -0500, Nick Bowler wrote:
>>> One of the tests we do with IPsec involves sending and receiving UDP
>>> datagrams of all sizes from 1 to N bytes, where N is much larger than
>>> the MTU. In this particular instance, the MTU is 1500 bytes and N is
>>> 10000 bytes. This test works fine with IPv4, but I'm getting an
>>> occasional oops on Linus' master with IPv6 (output at end of email). We
>>> also run the same test where N is less than the MTU, and it does not
>>> trigger this issue. The resulting fallout seems to eventually lock up
>>> the box (although it continues to work for a little while afterwards).
>>>
>>> The issue appears timing related, and it doesn't always occur. This
>>> probably also explains why I've not seen this issue before now, as we
>>> recently upgraded all our lab systems to machines from this century
>>> (with newfangled dual core processors). This also makes it somewhat
>>> hard to reproduce, but I can trigger it pretty reliably by running 'yes'
>>> in an ssh session (which doesn't use IPsec) while running the test:
>>> it'll usually trigger in 2 or 3 runs. The choice of cipher suite
>>> appears to be irrelevant.
>>>
>>> I built a relatively old kernel (2.6.34) and could not reproduce the
>>> issue there, so I ran a git bisect. It pointed to the following, which
>>> (unsurprisingly) no longer reverts cleanly.
>>>
>>> Let me know if you need any more info. I'll see if I can reproduce the
>>> issue with a smaller test case...
>>
>> OK, here's a somewhat straightforward way to reproduce it that I've
>> found. It uses a short test program called "udp_burst" which simply
>> transmits a bunch of UDP datagrams at all sizes between 1 and 10000,
>> included at the end of this mail.
>>[snip]
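
Since the test program is snipped above, here is a rough sketch of what
such a sender could look like. It is not Nick's udp_burst; the function
name and signature are hypothetical, and it only assumes a datagram
socket that has already been connected to the peer.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Send one datagram of every size from 1 to max_len bytes on an
 * already-connected datagram socket. Returns how many datagrams were
 * sent in full. Hypothetical helper, not the original udp_burst.
 */
static size_t burst_all_sizes(int fd, size_t max_len)
{
    static unsigned char payload[65536];   /* zero-filled payload */
    size_t sent = 0;

    /* Never read past the end of the payload buffer. */
    if (max_len > sizeof(payload))
        max_len = sizeof(payload);

    for (size_t len = 1; len <= max_len; len++) {
        ssize_t n = send(fd, payload, len, 0);
        if (n == (ssize_t)len)
            sent++;
    }
    return sent;
}
```

For the scenario described above, fd would be an AF_INET6/SOCK_DGRAM
socket connected to the IPsec peer and max_len would be 10000, so that
most datagrams exceed the 1500 byte MTU and have to be fragmented.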
>
> Please note commit 80c802f307 added a known bug, fixed in commit
> 0b150932197b (xfrm: avoid possible oopse in xfrm_alloc_dst)
>
> Given commit 80c802f307 complexity, we can assume other bugs are to be
> fixed as well.
>
> Unfortunately, Timo seems unresponsive.

This looks quite different, and I have been trying to figure out what
causes it. The oops happens in ip6_fragment(), which indicates that not
enough headroom was allocated (skb underrun). My initial thought is
that this is an IPv6 bug that was merely uncovered by my commit,
especially since the IPv4 side is happy. But I haven't been able to
figure this one out yet.
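
To make "skb underrun" concrete, here is a minimal userspace sketch. It
is not kernel code: struct toy_skb and the toy_* helpers are
hypothetical stand-ins for struct sk_buff, skb_reserve() and
skb_push(). The point is only that prepending a header consumes
headroom, and that the push fails once the reserved headroom is used
up (the real skb_push() panics instead of returning an error).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct toy_skb {
    unsigned char buf[256];   /* backing storage */
    size_t data_off;          /* offset of first data byte == headroom */
    size_t len;               /* bytes of data currently in the buffer */
};

/* Reserve headroom on an empty buffer (compare skb_reserve()). */
static bool toy_reserve(struct toy_skb *skb, size_t headroom)
{
    if (headroom > sizeof(skb->buf))
        return false;
    skb->data_off = headroom;
    skb->len = 0;
    return true;
}

/* Bytes still available in front of the data. */
static size_t toy_headroom(const struct toy_skb *skb)
{
    return skb->data_off;
}

/*
 * Prepend hdr_len bytes of header space (compare skb_push()).
 * Returns false on underrun and leaves the buffer untouched.
 */
static bool toy_push(struct toy_skb *skb, size_t hdr_len)
{
    if (hdr_len > skb->data_off)
        return false;         /* not enough headroom: underrun */
    skb->data_off -= hdr_len;
    skb->len += hdr_len;
    return true;
}
```

With 40 bytes reserved, pushing a 40 byte IPv6 header succeeds, but a
further push of the 8 byte fragment header fails. If the headroom was
sized without accounting for everything that gets prepended later, the
fragmentation path is where that would show up.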

Could you also try Herbert's latest patch set:

  [0/6] Replace LL_ALLOCATED_SPACE to allow needed_headroom adjustment

It changes how the headroom is calculated, and *might* fix this issue
too, if it is caused by the same SMP race condition that was uncovered
by my other commit earlier.

- Timo