Message-ID: <CA+FuTSebSMp76U83RpGZVxf=p_Zb9FNH2kScB521x1Z+1zEu9g@mail.gmail.com>
Date: Mon, 31 Oct 2022 17:16:42 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jiri Benc <jbenc@...hat.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Shmulik Ladkani <shmulik.ladkani@...il.com>,
netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
Tomas Hruby <tomas@...era.io>,
Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
alexanderduyck@...a.com, Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net] net: gso: fix panic on frag_list with mixed head
alloc types
On Mon, Oct 31, 2022 at 12:53 PM Jiri Benc <jbenc@...hat.com> wrote:
>
> On Sat, 29 Oct 2022 10:10:03 -0400, Willem de Bruijn wrote:
> > If a device has different allocation strategies depending on packet
> > size, and GRO coalesces those using a list, then indeed this does not
> > have to hold. GRO requires the same packet size and thus allocation
> > strategy to coalesce -- except for the last segment.
>
> That's exactly what I saw: the last segment was different.
>
> However, I don't see anything in the GRO code that enforces that. It
> appears that currently, it just usually happens that way. When there's
> a burst of packets for the given flow on the wire, only the last
> segment is small (and thus malloced) and there's no immediate packet
> following for the same flow. What would happen if (for whatever reason)
> there was such a packet following?
This is enforced. In tcp_gro_receive:
	/* If skb is a GRO packet, make sure its gso_size matches
	 * prior packet mss.
	 *
	 * If it is a single frame, do not aggregate it if its length
	 * is bigger than our mss.
	 */
	if (unlikely(skb_is_gso(skb)))
		flush |= (mss != skb_shinfo(skb)->gso_size);
	else
		flush |= (len - 1) >= mss;

[..]

	/* Force a flush if last segment is smaller than mss. */
	if (unlikely(skb_is_gso(skb)))
		flush = len != NAPI_GRO_CB(skb)->count *
			skb_shinfo(skb)->gso_size;
	else
		flush = len < mss;
That branch and the comment are quite new, introduced together with GRO
support for HW-GRO packets. But the else clauses are not new.
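
To make the effect concrete, here is a tiny userspace model of just
those two non-GSO else clauses (function names and the mss value are
made up for illustration; this is not kernel code):

	#include <stdbool.h>
	#include <stdio.h>

	/* Applied to an incoming frame before it is merged into the
	 * held GRO packet: frames longer than mss are never coalesced.
	 */
	static bool flush_before_merge(unsigned int len, unsigned int mss)
	{
		return (len - 1) >= mss;
	}

	/* Applied after the merge: a segment shorter than mss marks the
	 * GRO packet for immediate delivery, so no later frame of the
	 * same flow can be appended behind the short segment.
	 */
	static bool flush_after_merge(unsigned int len, unsigned int mss)
	{
		return len < mss;
	}

	int main(void)
	{
		unsigned int mss = 1448;

		/* full-sized segment: merged, GRO packet stays open (0 0) */
		printf("%d %d\n", flush_before_merge(1448, mss),
		       flush_after_merge(1448, mss));
		/* short trailing segment: merged, then flushed (0 1) */
		printf("%d %d\n", flush_before_merge(100, mss),
		       flush_after_merge(100, mss));
		return 0;
	}

So a short segment is still merged, but it immediately closes the GRO
packet: a subsequent frame of the same flow starts a new GRO packet
rather than being appended behind the short one.
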
>
> > I don't see any allocation in vmxnet3 that uses a head frag, though.
> > There is a small packet path (rxDataRingUsed), but both small and
> > large allocate using a kmalloc-backed skb->data as far as I can tell.
>
> I believe the logic is that for rxDataRingUsed,
> netdev_alloc_skb_ip_align is called to alloc an skb to copy the data into,
> passing to it the actual packet length. If it's small enough,
> __netdev_alloc_skb will kmalloc the data. However, for !rxDataRingUsed,
> the skb for the DMA buffer is allocated with a larger length and
> __netdev_alloc_skb will use page_frag_alloc.
That explains perfectly.
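
For reference, the size check lives in __netdev_alloc_skb() itself.
Roughly (paraphrased from net/core/skbuff.c; exact thresholds and the
surrounding code vary by kernel version, so this will not compile
as-is):

	len += NET_SKB_PAD;

	/* too small or too big: kmalloc-backed head via __alloc_skb() */
	if (len <= SKB_WITH_OVERHEAD(1024) ||
	    len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
		skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
		...
	} else {
		/* everything in between: page frag backed head */
		data = page_frag_alloc(nc, len, gfp_mask);
		...
		skb = __build_skb(data, len);
		...
		skb->head_frag = 1;
	}

So a copy into an skb sized to the actual packet length flips between
the two paths purely on whether the frame fits under the kmalloc
cutoff, which matches what you describe.
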
> I admit I've not spent that much time understanding the logic in the
> driver. I was satisfied when the perceived logic matched what I saw in
> the kernel memory dump. I may have easily missed something, such as
> Jakub's point that it's not actually the driver deciding on the
> allocation strategy but rather __netdev_alloc_skb on its own. But the
> outcome still seems to be that small packets are kmalloced, while
> larger packets are page backed. Am I wrong?
>
> > In any case, iterating over all frags is more robust. This is an edge
> > case, fine to incur the cost there.
>
> Thanks! We might get a minor speedup if we check only the last segment;
> but first, I'd like to be proven wrong about GRO not enforcing this.
> Plus, I wonder whether the speedup would be measurable if we have to
> iterate through the list to find the last segment anyway.
>
> Unless there are objections or clarifications (and unless I'm wrong
> above), I'll send a v2 with the commit message corrected and with the
> same code.
Sounds good to me, thanks.
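
For anyone following along, the shape of the check being discussed is
roughly this (a sketch only, not the actual patch; the helper name is
made up, while skb_walk_frags(), skb_headlen() and head_frag are
existing kernel helpers/fields):

	/* Walk the whole frag_list and report whether every segment that
	 * has linear data also has a page-frag backed head.  A mix of
	 * head types means the fast path in skb_segment() cannot treat
	 * the heads uniformly.
	 */
	static bool frag_list_heads_all_frag(struct sk_buff *skb)
	{
		struct sk_buff *iter;

		skb_walk_frags(skb, iter)
			if (skb_headlen(iter) && !iter->head_frag)
				return false;

		return true;
	}

Checking every entry rather than only the last one is what makes it
robust against GRO ever producing a different layout.
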
>
> Thanks to all!
>
> Jiri
>