Message-ID: <20250602161857eucms1p2fb159a3058fd7bf2b668282529226830@eucms1p2>
Date: Mon, 02 Jun 2025 18:18:57 +0200
From: Eryk Kubanski <e.kubanski@...tner.samsung.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Stanislav Fomichev
	<stfomichev@...il.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"bjorn@...nel.org" <bjorn@...nel.org>, "magnus.karlsson@...el.com"
	<magnus.karlsson@...el.com>, "jonathan.lemon@...il.com"
	<jonathan.lemon@...il.com>
Subject: RE: Re: Re: [PATCH bpf v2] xsk: Fix out of order segment free in
 __xsk_generic_xmit()

> Eryk, can you tell us a bit more about HW you're using? The problem you
> described simply can not happen for HW with in-order completions. You
> can't complete descriptor from slot 5 without going through completion of
> slot 3. So our assumption is you're using HW with out-of-order
> completions, correct?

Maciej, this wasn't reproduced on any hardware.
I found this bug while working on generic AF_XDP.

We're using a MACVLAN deployment where two or more
sockets share a single MACVLAN device queue.
Traffic doesn't even need to leave the host...

The SKB doesn't even need to complete in this case
to observe the bug. It's enough if an earlier writer
just fails after the descriptor write. This case is
covered in Notes 5) of my diagram.

Are you sure that __dev_direct_xmit() will keep
the packets on the same thread? What about
NAPI, XPS, IRQs, etc.?

If sendmsg() is issued by two threads, you don't
know which one will complete first. You can still
have out-of-order completion relative to the
descriptor CQ write.

This isn't a problem with out-of-order HW completion,
but a problem with out-of-order completion relative
to the sendmsg() call and the descriptor write.

But the packet doesn't even need to be sent. As I
explained above, a situation where one of the threads
fails is more than enough to trigger the bug.
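
To make both cases concrete, here is a toy model in Python, not
kernel code. It assumes the CQ address is written when the slot
is reserved and is published later by a plain producer advance,
which is how the descriptor write is described in this thread.
The class and function names (SharedCQ, reserve_addr, cancel,
submit) are made up for illustration.

```python
class SharedCQ:
    """Toy completion queue shared by several sockets (hypothetical model)."""

    def __init__(self, size):
        self.ring = [None] * size
        self.cached_prod = 0  # next slot to reserve, kernel-private
        self.producer = 0     # slots below this index are visible to user space

    def reserve_addr(self, addr):
        # The address is written at reserve time, before the packet is sent.
        self.ring[self.cached_prod % len(self.ring)] = addr
        self.cached_prod += 1

    def cancel(self, n=1):
        # Roll back the reservation counter after a failed transmit.
        self.cached_prod -= n

    def submit(self, n=1):
        # Publish the n oldest unpublished slots, whoever wrote them.
        self.producer += n

    def visible(self):
        return [self.ring[i % len(self.ring)] for i in range(self.producer)]


def out_of_order_completion():
    cq = SharedCQ(4)
    cq.reserve_addr("A")  # socket A writes slot 0
    cq.reserve_addr("B")  # socket B writes slot 1
    cq.submit()           # B's SKB is freed first
    return cq.visible()   # user space sees A, which is still in flight


def earlier_writer_fails():
    cq = SharedCQ(4)
    cq.reserve_addr("A")  # socket A writes slot 0
    cq.reserve_addr("B")  # socket B writes slot 1
    cq.cancel()           # A's transmit fails: cached_prod goes 2 -> 1
    cq.submit()           # B completes and publishes slot 0
    return cq.visible(), cq.cached_prod


if __name__ == "__main__":
    print(out_of_order_completion())
    print(earlier_writer_fails())
```

In both runs the only published entry is A, although only B
finished. In the second run cached_prod is back at 1, so the
next reservation overwrites the slot that holds B.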

> If that is the case then we have to think about possible solutions which
> probably won't be straight-forward. As Stan said current fix is a no-go.

Okay, what is your idea? In my opinion the only
thing I can do is to push the descriptors
before or after __dev_direct_xmit() and keep
them in some stack array.
However, this won't be compatible with the behaviour
of DRV-deployed AF_XDP: descriptors will be returned
right after the copy to the SKB instead of after the
SKB is sent. If this is fine for you, it's fine for me.

Otherwise this needs to be tied to the SKB lifetime,
but how?
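
As a rough illustration of the first option (the stack array),
here is a toy Python model, not a proposed patch. It assumes
the addresses are kept in a local list and are written and
published in a single critical section around the transmit.
CompletionQueue, push and send are hypothetical names, and the
free-space check a real ring needs is left out.

```python
import threading


class CompletionQueue:
    """Toy CQ where write and publish happen together (hypothetical model)."""

    def __init__(self, size):
        self.ring = [None] * size
        self.producer = 0
        self.lock = threading.Lock()

    def push(self, addrs):
        # Write and publish under one lock, so a published slot always
        # holds an address written by the sender that published it.
        with self.lock:
            for addr in addrs:
                self.ring[self.producer % len(self.ring)] = addr
                self.producer += 1

    def visible(self):
        return [self.ring[i % len(self.ring)] for i in range(self.producer)]


def send(cq, addrs, xmit_ok):
    local = list(addrs)  # stands in for the stack array
    if not xmit_ok:
        return False     # nothing was written to the CQ, nothing to roll back
    cq.push(local)
    return True


def demo():
    cq = CompletionQueue(4)
    send(cq, ["A"], xmit_ok=False)  # earlier writer fails
    send(cq, ["B"], xmit_ok=True)   # later writer succeeds
    return cq.visible()


if __name__ == "__main__":
    print(demo())
```

Here only B is published after A fails. The cost is the one
described above: the address is returned once the sender is
done with it, not when the SKB is actually freed.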
