lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250210210645.GE554665@kernel.org>
Date: Mon, 10 Feb 2025 21:06:45 +0000
From: Simon Horman <horms@...nel.org>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	John Fastabend <john.fastabend@...il.com>,
	Andrii Nakryiko <andrii@...nel.org>,
	"Jose E. Marchesi" <jose.marchesi@...cle.com>,
	Toke Høiland-Jørgensen <toke@...hat.com>,
	Magnus Karlsson <magnus.karlsson@...el.com>,
	Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
	Przemek Kitszel <przemyslaw.kitszel@...el.com>,
	Jason Baron <jbaron@...mai.com>,
	Casey Schaufler <casey@...aufler-ca.com>,
	Nathan Chancellor <nathan@...nel.org>, bpf@...r.kernel.org,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next 4/4] xsk: add helper to get &xdp_desc's DMA and
 meta pointer in one go

On Mon, Feb 10, 2025 at 05:00:36PM +0100, Alexander Lobakin wrote:
> From: Simon Horman <horms@...nel.org>
> Date: Sun, 9 Feb 2025 11:03:44 +0000
> 
> > On Thu, Feb 06, 2025 at 07:26:29PM +0100, Alexander Lobakin wrote:
> >> Currently, when your driver supports XSk Tx metadata and you want to
> >> send an XSk frame, you need to do the following:
> >>
> >> * call external xsk_buff_raw_get_dma();
> >> * call inline xsk_buff_get_metadata(), which calls external
> >>   xsk_buff_raw_get_data() and then do some inline checks.
> >>
> >> This effectively means that the following piece:
> >>
> >> addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
> >>
> >> is done twice per frame, plus you have 2 external calls per frame, plus
> >> this:
> >>
> >> 	meta = pool->addrs + addr - pool->tx_metadata_len;
> >> 	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
> >>
> >> is always inlined, even if there's no meta or it's invalid.
> >>
> >> Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that
> >> in one go. It returns a small structure with 2 fields: DMA address,
> >> filled unconditionally, and metadata pointer, non-NULL only if it's
> >> present and valid. The address correction is performed only once and
> >> you also have only 1 external call per XSk frame, which does all the
> >> calculations and checks outside of your hotpath. You only need to
> >> check `if (ctx.meta)` for the metadata presence.
> >> To not copy any existing code, derive address correction and getting
> >> virtual and DMA address into small helpers. bloat-o-meter reports no
> >> object code changes for the existing functionality.
> >>
> >> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
> > 
> > Hi Alexander,
> > 
> > I think that this patch needs to be accompanied by at least one
> > patch that uses xsk_buff_raw_get_ctx() in a driver.
> 
> This mini-series is the final part of my Chapter III, which was all
> about prereqs in order to add libeth_xdp and then XDP for idpf.
> This helper will be used in the next series (Chapter IV) I'll send once
> this lands.

Understood. If it's going to be used in chapter IV then, given
that we've made it to chapter II, that is fine by me.

> > Also, as this seems to be an optimisation, some performance data would
> > be nice too.
> 
> -1 Kb of object code which has an unrolled-by-8 loop which used this
> function each iteration. I don't remember the perf numbers since it was
> a year ago and since then I've been using this helper only, but it was
> something around a couple procent (which is several hundred Kpps when it
> comes to XSk).

Thanks. It might be worth including some of that information in the commit
message, but I don't feel strongly about it.

> 
> > 
> > Which brings me to my last point. I'd always understood that
> > returning a struct was discouraged due to performance implications.
> 
> Rather stack usage, not perf implications. Compound returns are used
> heavily throughout the kernel code when sizeof(result) <= 16 bytes.
> Here it's also 16 bytes. Just the same as one __u128. Plus this function
> doesn't recurse, so the stack won't blow up.

Also understood. It seems my assumptions were somewhat wrong.
So I have no objections to this approach.

> > Perhaps that information is out of date, doesn't apply because
> > the returned struct is so small in this case, or just plain wrong.
> > But I'd appreciate it if you could add some colour to this.
> 
> Moreover, the function is global, not inline, so passing a pointer here
> instead of returning a struct may even behave worse in this case.
> 
> (and we'll save basically only 8 bytes on the stack, which I don't
>  believe is worth it).
> 
> Thanks,
> Olek
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ