lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoCwGQyNSv9BZ_jfsia6YFoyT790iknqxG7bB7wVi3C_vQ@mail.gmail.com>
Date: Wed, 22 Oct 2025 11:11:01 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Alessandro Decina <alessandro.d@...il.com>
Cc: netdev@...r.kernel.org, Maciej Fijalkowski <maciej.fijalkowski@...el.com>, 
	"David S. Miller" <davem@...emloft.net>, Alexei Starovoitov <ast@...nel.org>, 
	Andrew Lunn <andrew+netdev@...n.ch>, Daniel Borkmann <daniel@...earbox.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, 
	Paolo Abeni <pabeni@...hat.com>, Przemek Kitszel <przemyslaw.kitszel@...el.com>, 
	Stanislav Fomichev <sdf@...ichev.me>, Tirthendu Sarkar <tirthendu.sarkar@...el.com>, 
	Tony Nguyen <anthony.l.nguyen@...el.com>, bpf@...r.kernel.org, 
	intel-wired-lan@...ts.osuosl.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net v2 1/1] i40e: xsk: advance next_to_clean on status descriptors

On Wed, Oct 22, 2025 at 1:33 AM Alessandro Decina
<alessandro.d@...il.com> wrote:
>
> Whenever a status descriptor is received, i40e processes and skips over
> it, correctly updating next_to_process but forgetting to update
> next_to_clean. In the next iteration this accidentally causes the
> creation of an invalid multi-buffer xdp_buff where the first fragment
> is the status descriptor.
>
> If then a skb is constructed from such an invalid buffer - because the
> eBPF program returns XDP_PASS - a panic occurs:
>
> [ 5866.367317] BUG: unable to handle page fault for address: ffd31c37eab1c980
> [ 5866.375050] #PF: supervisor read access in kernel mode
> [ 5866.380825] #PF: error_code(0x0000) - not-present page
> [ 5866.386602] PGD 0
> [ 5866.388867] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 5866.393575] CPU: 34 UID: 0 PID: 0 Comm: swapper/34 Not tainted 6.17.0-custom #1 PREEMPT(voluntary)
> [ 5866.403740] Hardware name: Supermicro AS -2115GT-HNTR/H13SST-G, BIOS 3.2 03/20/2025
> [ 5866.412339] RIP: 0010:memcpy+0x8/0x10
> [ 5866.416454] Code: cc cc 90 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 48 89 f8 48 89 d1 <f3> a4 e9 fc 26 c0 fe 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> [ 5866.437538] RSP: 0018:ff428d9ec0bb0ca8 EFLAGS: 00010286
> [ 5866.443415] RAX: ff2dd26dbd8f0000 RBX: ff2dd265ad161400 RCX: 00000000000004e1
> [ 5866.451435] RDX: 00000000000004e1 RSI: ffd31c37eab1c980 RDI: ff2dd26dbd8f0000
> [ 5866.459454] RBP: ff428d9ec0bb0d40 R08: 0000000000000000 R09: 0000000000000000
> [ 5866.467470] R10: 0000000000000000 R11: 0000000000000000 R12: ff428d9eec726ef8
> [ 5866.475490] R13: ff2dd26dbd8f0000 R14: ff2dd265ca2f9fc0 R15: ff2dd26548548b80
> [ 5866.483509] FS:  0000000000000000(0000) GS:ff2dd2c363592000(0000) knlGS:0000000000000000
> [ 5866.492600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5866.499060] CR2: ffd31c37eab1c980 CR3: 0000000178d7b040 CR4: 0000000000f71ef0
> [ 5866.507079] PKRU: 55555554
> [ 5866.510125] Call Trace:
> [ 5866.512867]  <IRQ>
> [ 5866.515132]  ? i40e_clean_rx_irq_zc+0xc50/0xe60 [i40e]
> [ 5866.520921]  i40e_napi_poll+0x2d8/0x1890 [i40e]
> [ 5866.526022]  ? srso_alias_return_thunk+0x5/0xfbef5
> [ 5866.531408]  ? raise_softirq+0x24/0x70
> [ 5866.535623]  ? srso_alias_return_thunk+0x5/0xfbef5
> [ 5866.541011]  ? srso_alias_return_thunk+0x5/0xfbef5
> [ 5866.546397]  ? rcu_sched_clock_irq+0x225/0x1800
> [ 5866.551493]  __napi_poll+0x30/0x230
> [ 5866.555423]  net_rx_action+0x20b/0x3f0
> [ 5866.559643]  handle_softirqs+0xe4/0x340
> [ 5866.563962]  __irq_exit_rcu+0x10e/0x130
> [ 5866.568283]  irq_exit_rcu+0xe/0x20
> [ 5866.572110]  common_interrupt+0xb6/0xe0
> [ 5866.576425]  </IRQ>
> [ 5866.578791]  <TASK>
>
> Advance next_to_clean to ensure invalid xdp_buff(s) aren't created.
>
> Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support")
> Signed-off-by: Alessandro Decina <alessandro.d@...il.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index 9f47388eaba5..dbc19083bbb7 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -441,13 +441,18 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget)
>                 dma_rmb();
>
>                 if (i40e_rx_is_programming_status(qword)) {
> +                       u16 ntp;
> +
>                         i40e_clean_programming_status(rx_ring,
>                                                       rx_desc->raw.qword[0],
>                                                       qword);
>                         bi = *i40e_rx_bi(rx_ring, next_to_process);
>                         xsk_buff_free(bi);
> -                       if (++next_to_process == count)
> +                       ntp = next_to_process++;
> +                       if (next_to_process == count)
>                                 next_to_process = 0;
> +                       if (next_to_clean == ntp)
> +                               next_to_clean = next_to_process;
>                         continue;
>                 }
>
> --
> 2.43.0
>
>

I'm copying your reply from v1 as shown below so that we can continue
with the discussion :)

> It really depends on whether a status descriptor can be received in the
> middle of multi-buffer packet. Based on the existing code, I assumed it
> can. Therefore, consider this case:
>
> [valid_1st_packet][status_descriptor][valid_2nd_packet]
>
> In this case you want to skip status_descriptor but keep the existing
> logic that leads to:
>
>     first = next_to_clean = valid_1st_packet
>
> so then you can go and add valid_2nd_packet as a fragment to the first.

Sorry, honestly, I still don't follow you.

Looking at the case you provided, I think @first always pointing to
valid_1st_packet is valid which does not bring any trouble. You mean
the case is what you're trying to handle?

You patch updates next_to_clean that is only used at the very
beginning, so it will not affect @first. Imaging the following case:

     [status_descriptor][valid_1st_packet][valid_2nd_packet]

Even if the next_to_clean is updated, the @first still points to
[status_descriptor] that is invalid and that will later cause the
panic when constructing the skb.

I'm afraid that we're not on the same page. Let me confirm that it is
@first that points to the status descriptor that causes the panic,
right? Could you share with us the exact case just like you did as
above. Thank you.

Thanks,
Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ