Message-ID: <CADEc0q4sEACJY03CYxOWPPvPrB=n7==2KqHz57AY+CR6gSJjAw@mail.gmail.com>
Date: Wed, 19 Nov 2025 16:38:13 +0800
From: Jiefeng <jiefeng.z.zhang@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, pabeni@...hat.com,
andrew+netdev@...n.ch, edumazet@...gle.com, linux-kernel@...r.kernel.org,
irusskikh@...vell.com
Subject: Re: [PATCH net] net: atlantic: fix fragment overflow handling in RX path
On Wed, Nov 19, 2025 at 4:24 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Tue, 18 Nov 2025 15:04:02 +0800 jiefeng.z.zhang@...il.com wrote:
> > From: Jiefeng Zhang <jiefeng.z.zhang@...il.com>
> >
> > The atlantic driver can receive packets with more than MAX_SKB_FRAGS (17)
> > fragments when handling large multi-descriptor packets. This causes an
> > out-of-bounds write in skb_add_rx_frag_netmem() leading to kernel panic.
> >
> > The issue occurs because the driver doesn't check the total number of
> > fragments before calling skb_add_rx_frag(). When a packet requires more
> > than MAX_SKB_FRAGS fragments, the fragment index exceeds the array bounds.
> >
> > Add a check in __aq_ring_rx_clean() to ensure the total number of fragments
> > (including the initial header fragment and subsequent descriptor fragments)
> > does not exceed MAX_SKB_FRAGS. If it does, drop the packet gracefully
> > and increment the error counter.
>
> First off, some basic Linux mailing list savoir vivre:
> - please don't top post
> - please don't resubmit your code within 24h of previous posting
> - please wait for a discussion to close before you send another version
>
> Quoting your response:
>
> https://lore.kernel.org/all/CADEc0q6iLdpwYsyGAwH4qzST8G7asjdqgR6+ymXMy1k0wRwhNQ@mail.gmail.com/
>
> > I have used git send-email to send my code.
> >
> > As for the patch --The aquantia/atlantic driver supports a maximum of
> > AQ_CFG_SKB_FRAGS_MAX (32U) fragments, while the kernel limits the
> > maximum number of fragments to MAX_SKB_FRAGS (17).
>
> Frag count limits in drivers are usually for Tx not Rx.
> Again, why do you think this driver can generate more frags than 17?
> --
> pw-bot: cr
Thank you for the feedback. You're right that fragment limits are
usually for TX, but this is a special case in the RX path.
Also, I apologize for resubmitting the patch within 24 hours. The
initial submission had formatting issues, so I resubmitted it using
git send-email to ensure proper formatting. I understand the mailing
list etiquette and will wait for discussion before sending another
version in the future.
The atlantic hardware supports multi-descriptor packet reception
(RSC). When a large packet arrives, the hardware splits it across
multiple descriptors, where each descriptor can hold up to 2048 bytes
(AQ_CFG_RX_FRAME_MAX).
The logic bug is this: the multi-descriptor loop already counts fragments
(frag_cnt at aq_ring.c:559), but the check at aq_ring.c:568 considers only
frag_cnt, not the additional fragment contributed by the first buffer
(when buff->len > hdr_len). The fragments are actually added later
(aq_ring.c:634-667):
- One fragment from the first buffer (if hdr_len < buff->len)
- Plus frag_cnt fragments from subsequent descriptors
This can exceed MAX_SKB_FRAGS (17) in edge cases:
- If frag_cnt = 17 (the check passes)
- And the first buffer has a fragment (buff->len > hdr_len)
- Then total = 1 + 17 = 18 > MAX_SKB_FRAGS
While the hardware MTU limit is 16334 bytes (B0/ATL2), which needs only
~8 fragments, edge cases such as LRO aggregation or hardware anomalies
can produce more descriptors.
The panic occurred because skb_add_rx_frag() was called with an index
>= MAX_SKB_FRAGS, causing an out-of-bounds write. The fix ensures
we check the total fragment count (first buffer fragment + frag_cnt)
before calling skb_add_rx_frag().
I have encountered this crash in production with an Aquantia AQtion
AQC113 (Antigua) 10G NIC:
```
<4>[175432.612171] RIP: 0010:skb_add_rx_frag_netmem+0x29/0xd0
<4>[175432.612193] Code: 90 f3 0f 1e fa 0f 1f 44 00 00 48 89 f8 41 89 ca 48 89 d7 48 63 ce 8b 90 c0 00 00 00 48 c1 e1 04 48 01 ca 48 03 90 c8 00 00 00 <48> 89 7a 30 44 89 52 3c 44 89 42 38 40 f6 c7 01 75 74 48 89 fa 83
<4>[175432.612212] RSP: 0018:ffffa9bec02a8d50 EFLAGS: 00010287
<4>[175432.612223] RAX: ffff925b22e80a00 RBX: ffff925ad38d2700 RCX: fffffffe0a0c8000
<4>[175432.612234] RDX: ffff9258ea95bac0 RSI: ffff925ae0a0c800 RDI: 0000000000037a40
<4>[175432.612244] RBP: 0000000000000024 R08: 0000000000000000 R09: 0000000000000021
<4>[175432.612254] R10: 0000000000000848 R11: 0000000000000000 R12: ffffa9bec02a8e24
<4>[175432.612264] R13: ffff925ad8615570 R14: 0000000000000000 R15: ffff925b22e80a00
<4>[175432.612274] FS:  0000000000000000(0000) GS:ffff925e47880000(0000) knlGS:0000000000000000
<4>[175432.612287] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[175432.612296] CR2: ffff9258ea95baf0 CR3: 0000000166022004 CR4: 0000000000f72ef0
<4>[175432.612307] PKRU: 55555554
<4>[175432.612314] Call Trace:
<4>[175432.612323] <IRQ>
<4>[175432.612334] aq_ring_rx_clean+0x175/0xe60 [atlantic]
<4>[175432.612398] ? aq_ring_rx_clean+0x14d/0xe60 [atlantic]
<4>[175432.612455] ? aq_ring_tx_clean+0xdf/0x190 [atlantic]
<4>[175432.612508] ? kmem_cache_free+0x348/0x450
<4>[175432.612525] ? aq_vec_poll+0x81/0x1d0 [atlantic]
<4>[175432.612575] ? __napi_poll+0x28/0x1c0
<4>[175432.612587] ? net_rx_action+0x337/0x420
```