[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4B0883FD.2090806@trash.net>
Date: Sun, 22 Nov 2009 01:21:17 +0100
From: Patrick McHardy <kaber@...sh.net>
To: David Miller <davem@...emloft.net>
CC: ben@...footnetworks.com, netdev@...r.kernel.org
Subject: Re: Bridge + Conntrack + SKB Recycle: Fragment Reassembly Errors
David Miller wrote:
> From: Patrick McHardy <kaber@...sh.net>
> Date: Tue, 10 Nov 2009 17:50:38 +0100
>
>> This code in ip_fragment() looks suspicious:
>>
>> if (skb_has_frags(skb)) {
>> ...
>> skb_walk_frags(skb, frag) {
>> ...
>> if (skb->sk) {
>> frag->sk = skb->sk;
>> frag->destructor = sock_wfree;
>> truesizes += frag->truesize;
>> }
>>
>> truesizes is later used to adjust truesize of the head skb.
>> For some reason this is only done when it originated from a
>> local socket.
>
> Well, it shouldn't look _that_ suspicious.
>
> What this code is doing is making sure that after we make all of these
> changes, the truesize of the SKBs referrng to the socket do not
> change.
>
> It's simply making sure that the math works out when all the
> sock_wfree() calls occur later.
>
> If we don't have a socket involved, there is no reason to make
> these adjustments.
That seems to be the assumption. But ip_defrag() uses skb->truesize
as well to make sure the defragmentation memory limits are not exceeded.
In this case what seems to be happening is:
- gianfar with skb recycling enabled receives a number of fragments
on a bridge (~64000b total)
- conntrack defragments the packet using ip_defrag(), which causes
skb->truesize of the head fragment to account for all the fragments
- the packet is refragmented in the bridging code using ip_fragment().
This doesn't re-adjust skb->truesize of the head fragment when the
packet is not associated with a socket
- the head is recycled in gianfar
- another fragment is received and reuses the recycled skb with a
huge truesize
- the defragmentation limits are exceeded due to the huge truesize
So it seems we need to adjust skb->truesize in ip_fragment() since
skb_recycle_check() assumes the skb is linear (and therefore
skb->truesize reflects the linear size). Ben's suggestions of adding
an upper limit based on the requested size to skb_recycle_check()
makes sense to me as well to avoid this problem when recycling large
linear skbs.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists