Message-ID: <dcb3053f-6588-4c87-be42-a172dacb1828@gmail.com>
Date: Fri, 16 May 2025 16:47:06 +0300
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski
 <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew+netdev@...n.ch>,
 Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Jesper Dangaard Brouer <hawk@...nel.org>,
 John Fastabend <john.fastabend@...il.com>,
 Network Development <netdev@...r.kernel.org>, linux-rdma@...r.kernel.org,
 LKML <linux-kernel@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
 Moshe Shemesh <moshe@...dia.com>, Mark Bloch <mbloch@...dia.com>,
 Gal Pressman <gal@...dia.com>, Carolina Jubran <cjubran@...dia.com>,
 Sebastiano Miano <mianosebastiano@...il.com>,
 Samuel Dobron <sdobron@...hat.com>
Subject: Re: [PATCH net-next] net/mlx5e: Reuse per-RQ XDP buffer to avoid
 stack zeroing overhead



On 15/05/2025 3:26, Alexei Starovoitov wrote:
> On Wed, May 14, 2025 at 1:04 PM Tariq Toukan <tariqt@...dia.com> wrote:
>>
>> From: Carolina Jubran <cjubran@...dia.com>
>>
>> CONFIG_INIT_STACK_ALL_ZERO introduces a performance cost by
>> zero-initializing all stack variables on function entry. The mlx5 XDP
>> RX path previously allocated a struct mlx5e_xdp_buff on the stack per
>> received CQE, resulting in measurable performance degradation under
>> this config.
>>
>> This patch reuses a mlx5e_xdp_buff stored in the mlx5e_rq struct,
>> avoiding per-CQE stack allocations and repeated zeroing.
>>
>> With this change, XDP_DROP and XDP_TX performance matches that of
>> kernels built without CONFIG_INIT_STACK_ALL_ZERO.
>>
>> Performance was measured on a ConnectX-6Dx using a single RX channel
>> (1 CPU at 100% usage) at ~50 Mpps. The baseline results were taken from
>> net-next-6.15.
>>
>> Stack zeroing disabled:
>> - XDP_DROP:
>>      * baseline:                     31.47 Mpps
>>      * baseline + per-RQ allocation: 32.31 Mpps (+2.68%)
>>
>> - XDP_TX:
>>      * baseline:                     12.41 Mpps
>>      * baseline + per-RQ allocation: 12.95 Mpps (+4.30%)
> 
> Looks good, but where are these gains coming from ?
> The patch just moves mxbuf from stack to rq.
> The number of operations should really be the same.
> 

My guess is that it's cache-related: hot/cold areas, alignment, and the
movement of other fields within the mlx5e_rq structure.
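
For readers following the thread, the change under discussion can be
sketched in plain userspace C as below. This is only an illustration
under stated assumptions, not the actual mlx5 code: struct demo_xdp_buff,
struct demo_rq and the demo_* functions are hypothetical stand-ins, and
the explicit memset() merely models the zeroing that
CONFIG_INIT_STACK_ALL_ZERO makes the compiler emit for on-stack
variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct mlx5e_xdp_buff; the real layout differs. */
struct demo_xdp_buff {
	void    *data;
	void    *data_end;
	uint32_t flags;
	uint8_t  scratch[64];	/* extra bytes so the zeroing cost is visible */
};

/* Hypothetical stand-in for struct mlx5e_rq. */
struct demo_rq {
	uint64_t packets;
	struct demo_xdp_buff mxbuf;	/* reused for every completion */
};

/* Stand-in for running the XDP program: just returns the frame length. */
static size_t demo_run_xdp(const struct demo_xdp_buff *b)
{
	const uint8_t *start = b->data;
	const uint8_t *end = b->data_end;

	assert(end >= start);
	return (size_t)(end - start);
}

/*
 * "Before": a fresh on-stack buffer per completion. The memset() models
 * the implicit zero-initialization added when stack zeroing is enabled.
 */
static size_t demo_handle_cqe_stack(struct demo_rq *rq, uint8_t *frame,
				    size_t len)
{
	struct demo_xdp_buff mxbuf;

	memset(&mxbuf, 0, sizeof(mxbuf));
	mxbuf.data = frame;
	mxbuf.data_end = frame + len;
	rq->packets++;
	return demo_run_xdp(&mxbuf);
}

/*
 * "After": reuse the buffer embedded in the RQ. Nothing is zeroed per
 * completion, so every field that is later read must be written here;
 * fields left untouched keep whatever the previous completion stored.
 */
static size_t demo_handle_cqe_per_rq(struct demo_rq *rq, uint8_t *frame,
				     size_t len)
{
	struct demo_xdp_buff *mxbuf = &rq->mxbuf;

	mxbuf->data = frame;
	mxbuf->data_end = frame + len;
	rq->packets++;
	return demo_run_xdp(mxbuf);
}
```

Both handlers perform the same per-packet field writes; the only
difference in this sketch is the per-call zeroing of the whole struct,
which is the cost the "stack zeroing enabled" numbers reflect.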

>> Stack zeroing enabled:
>> - XDP_DROP:
>>      * baseline:                     24.32 Mpps
>>      * baseline + per-RQ allocation: 32.27 Mpps (+32.7%)
> 
> This part makes sense.

