[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<CY5PR21MB375904FD3437BA610E6BDBD1BF392@CY5PR21MB3759.namprd21.prod.outlook.com>
Date: Sat, 30 Mar 2024 00:12:00 +0000
From: Dexuan Cui <decui@...rosoft.com>
To: Haiyang Zhang <haiyangz@...rosoft.com>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>
CC: stephen <stephen@...workplumber.org>, KY Srinivasan <kys@...rosoft.com>,
Paul Rosswurm <paulros@...rosoft.com>, "olaf@...fle.de" <olaf@...fle.de>,
"vkuznets@...hat.com" <vkuznets@...hat.com>, "davem@...emloft.net"
<davem@...emloft.net>, "wei.liu@...nel.org" <wei.liu@...nel.org>,
"edumazet@...gle.com" <edumazet@...gle.com>, "kuba@...nel.org"
<kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>, "leon@...nel.org"
<leon@...nel.org>, Long Li <longli@...rosoft.com>,
"ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>, "john.fastabend@...il.com"
<john.fastabend@...il.com>, "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"ast@...nel.org" <ast@...nel.org>, "sharmaajay@...rosoft.com"
<sharmaajay@...rosoft.com>, "hawk@...nel.org" <hawk@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>, "shradhagupta@...ux.microsoft.com"
<shradhagupta@...ux.microsoft.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
<stable@...r.kernel.org>
Subject: RE: [PATCH net] net: mana: Fix Rx DMA datasize and skb_over_panic
> From: LKML haiyangz <lkmlhyz@...rosoft.com> On Behalf Of Haiyang Zhang
> Sent: Friday, March 29, 2024 2:37 PM
> [...]
> mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be
> multiple of 64. So a packet slightly bigger than mtu+14, say 1536,
> can be received and cause skb_over_panic.
>
> Sample dmesg:
> [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536
> put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700
> end:0x6ea dev:<NULL>
> [ 5325.243689] ------------[ cut here ]------------
> [ 5325.245748] kernel BUG at net/core/skbuff.c:192!
> [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
> [ 5325.302941] Call Trace:
> [ 5325.304389] <IRQ>
> [ 5325.315794] ? skb_panic+0x4f/0x60
> [ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30
> [ 5325.319490] ? skb_panic+0x4f/0x60
> [ 5325.321161] skb_put+0x4e/0x50
> [ 5325.322670] mana_poll+0x6fa/0xb50 [mana]
> [ 5325.324578] __napi_poll+0x33/0x1e0
> [ 5325.326328] net_rx_action+0x12e/0x280
>
> As discussed internally, this alignment is not necessary. To fix
> this bug, remove it from the code. So oversized packets will be
> marked as CQE_RX_TRUNCATED by NIC, and dropped.
>
> Cc: stable@...r.kernel.org
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network
> Adapter (MANA)")
While the unnecessary alignment indeed first appeared in ca9c54d2d6a5
(Apr 2021), ca9c54d2d6a5 didn't cause any crash because the only
supported MTU size was 1500, and the RX buffer was a page (which is
big enough to hold an incoming packet of a size up to 1536 bytes), and
mana_rx_skb() created a big skb of 1 page (which is big enough to
hold the incoming packet); the only issue with ca9c54d2d6a5 was that
an incoming packet of 1515~1536 bytes (if such a packet was ever
received by the NIC) was still properly delivered to the upper layer
network stack, while ideally such a packet should be dropped by the
NIC driver as we would have received CQE_RX_TRUNCATED if
ca9c54d2d6a5 hadn't done the unnecessary alignment.
So, my understanding is that ca9c54d2d6a5 is imperfect
but it doesn't really cause any real issue.
It looks like the real issue started in the commit (Apr 12 14:16:02 2023)
2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes")
which introduces "rxq->alloc_size", which I think can be
smaller than rxq->datasize, and is used in mana_poll() --> ... ->
mana_build_skb(), which I think causes the crash because
the size of the allocated skb may not be able to hold a big
incoming packet.
Later, the commit (Apr 12 14:16:03 2023)
80f6215b450e ("net: mana: Add support for jumbo frame")
does code refactoring and introduces mana_get_rxbuf_cfg().
I suggest the Fixes tag should be updated to:
Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes")
, because if we used "Fixes: ca9c54d2d6a5", the distro vendors
would ask us: does this fix need to be backported to old kernels
that don't have 2fbbd712baf1? The fix can't apply cleanly there
and is not really needed there. The stable tree maintainers would
ask the same question.
> drivers/net/ethernet/microsoft/mana/mana_en.c | 2 +-
> include/net/mana/mana.h | 1 -
> 2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 59287c6e6cee..d8af5e7e15b4 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -601,7 +601,7 @@ static void mana_get_rxbuf_cfg(int mtu, u32
> *datasize, u32 *alloc_size,
>
> *alloc_size = mtu + MANA_RXBUF_PAD + *headroom;
>
> - *datasize = ALIGN(mtu + ETH_HLEN, MANA_RX_DATA_ALIGN);
> + *datasize = mtu + ETH_HLEN;
> }
>
I suggest the Fixes tag should be updated. Otherwise the fix
looks good to me.
Reviewed-by: Dexuan Cui <decui@...rosoft.com>
Powered by blists - more mailing lists