netdev - Re: [PATCH net] net: gso: fix panic on frag

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20221031175206.50a54083@griffin>
Date:   Mon, 31 Oct 2022 17:52:06 +0100
From:   Jiri Benc <jbenc@...hat.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Shmulik Ladkani <shmulik.ladkani@...il.com>,
        netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
        Tomas Hruby <tomas@...era.io>,
        Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
        alexanderduyck@...a.com, Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net] net: gso: fix panic on frag_list with mixed head
 alloc types

On Sat, 29 Oct 2022 10:10:03 -0400, Willem de Bruijn wrote:
> If a device has different allocation strategies depending on packet
> size, and GRO coalesces those using a list, then indeed this does not
> have to hold. GRO requires the same packet size and thus allocation
> strategy to coalesce -- except for the last segment.

That's exactly what I saw: the last segment was different.

However, I don't see anything in the GRO code that enforces that. It
appears that currently, it just usually happens that way. When there's
a burst of packets for the given flow on the wire, only the last
segment is small (and thus malloced) and there's no immediate packet
following for the same flow. What would happen if (for whatever reason)
there was such packet following?

> I don't see any allocation in vmxnet3 that uses a head frag, though.
> There is a small packet path (rxDataRingUsed), but both small and
> large allocate using a kmalloc-backed skb->data as far as I can tell.

I believe the logic is that for rxDataRingUsed,
netdev_alloc_skb_ip_align is called to alloc skb to copy data into,
passing to it the actual packet length. If it's small enough,
__netdev_alloc_skb will kmalloc the data. However, for !rxDataRingUsed,
the skb for dma buffer is allocated with a larger length and
__netdev_alloc_skb will use page_frag_alloc.

I admit I've not spend that much time understanding the logic in the
driver. I was satisfied when the perceived logic matched what I saw in
the kernel memory dump. I may have easily missed something, such as
Jakub's point that it's not actually the driver deciding on the
allocation strategy but rather __netdev_alloc_skb on its own. But the
outcome still seems to be that small packets are kmalloced, while
larger packets are page backed. Am I wrong?

> In any case, iterating over all frags is more robust. This is an edge
> case, fine to incur the cost there.

Thanks! We might get a minor speedup if we check only the last segment;
but first, I'd like to be proven wrong about GRO not enforcing this.
Plus, I wonder whether the speedup would be measurable if we have to
iterate through the list to find the last segment anyway.

Unless there are objections or clarifications (and unless I'm wrong
above), I'll send a v2 with the commit message corrected and with the
same code.

Thanks to all!

 Jiri