Message-ID: <87plbd361o.fsf@cloudflare.com>
Date: Fri, 26 Sep 2025 11:40:51 +0200
From: Jakub Sitnicki <jakub@...udflare.com>
To: Jacob Keller <jacob.e.keller@...el.com>
Cc: Michal Kubiak <michal.kubiak@...el.com>,
  <intel-wired-lan@...ts.osuosl.org>,  <maciej.fijalkowski@...el.com>,
  <aleksander.lobakin@...el.com>,  <larysa.zaremba@...el.com>,
  <netdev@...r.kernel.org>,  <przemyslaw.kitszel@...el.com>,
  <pmenzel@...gen.mpg.de>,  <anthony.l.nguyen@...el.com>,
 kernel-team@...udflare.com
Subject: Re: [PATCH iwl-next v3 0/3] ice: convert Rx path to Page Pool

On Thu, Sep 25, 2025 at 10:22 AM -07, Jacob Keller wrote:
> On 9/25/2025 2:56 AM, Jakub Sitnicki wrote:
>> On Thu, Sep 25, 2025 at 11:22 AM +02, Michal Kubiak wrote:
>>> This series modernizes the Rx path in the ice driver by removing legacy
>>> code and switching to the Page Pool API. The changes follow the same
>>> direction as previously done for the iavf driver, and aim to simplify
>>> buffer management, improve maintainability, and prepare for future
>>> infrastructure reuse.
>>>
>>> An important motivation for this work was addressing reports of poor
>>> performance in XDP_TX mode when IOMMU is enabled. The legacy Rx model
>>> incurred significant overhead due to per-frame DMA mapping, which
>>> limited throughput in virtualized environments. This series eliminates
>>> those bottlenecks by adopting Page Pool and bi-directional DMA mapping.
>>>
>>> The first patch removes the legacy Rx path, which relied on manual skb
>>> allocation and header copying. This path has become obsolete due to the
>>> availability of build_skb() and the increasing complexity of supporting
>>> features like XDP and multi-buffer.
>>>
>>> The second patch drops the page splitting and recycling logic. While
>>> once used to optimize memory usage, this logic introduced significant
>>> complexity and hotpath overhead. Removing it simplifies the Rx flow and
>>> sets the stage for Page Pool adoption.
>>>
>>> The final patch switches the driver to use the Page Pool and libeth
>>> APIs. It also updates the XDP implementation to use libeth_xdp helpers
>>> and optimizes XDP_TX by avoiding per-frame DMA mapping. This results in
>>> a significant performance improvement in virtualized environments with
>>> IOMMU enabled (over 5x gain in XDP_TX throughput). In other scenarios,
>>> performance remains on par with the previous implementation.
>>>
>>> This conversion also aligns with the broader effort to modularize and
>>> unify XDP support across Intel Ethernet drivers.
>>>
>>> Tested on various workloads including netperf and XDP modes (PASS, DROP,
>>> TX) with and without IOMMU. No regressions observed.
>> 
>> Will we be able to have 256 B of XDP headroom after this conversion?
>> 
>> Thanks,
>> -jkbs
>
> We should. The queues are configured through libeth, and the xdp
> field is set if XDP is enabled on that ring:
>
>> @@ -622,8 +589,14 @@ static unsigned int ice_get_frame_sz(struct ice_rx_ring *rx_ring)
>>   */
>>  static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
>>  {
>> +	struct libeth_fq fq = {
>> +		.count		= ring->count,
>> +		.nid		= NUMA_NO_NODE,
>> +		.xdp		= ice_is_xdp_ena_vsi(ring->vsi),
>> +		.buf_len	= LIBIE_MAX_RX_BUF_LEN,
>> +	};
>
>
> If .xdp is set, then the libeth Rx configuration reserves
> LIBETH_XDP_HEADROOM, which is XDP_PACKET_HEADROOM aligned up to
> NET_SKB_PAD plus an extra NET_IP_ALIGN, resulting in 258 bytes of
> reserved headroom.
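
That works out; spelling it out with the typical 64-byte NET_SKB_PAD
and the generic NET_IP_ALIGN of 2 (both are arch-dependent, so the
exact figure can vary):

/* Rough sketch of the formula above, not the actual libeth source. */
#define XDP_PACKET_HEADROOM	256
#define NET_SKB_PAD		64	/* max(32, L1_CACHE_BYTES) on most arches */
#define NET_IP_ALIGN		2	/* generic default; 0 on x86 */
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))

#define LIBETH_XDP_HEADROOM	(ALIGN(XDP_PACKET_HEADROOM, NET_SKB_PAD) + \
				 NET_IP_ALIGN)

/* ALIGN(256, 64) + 2 == 258 */
_Static_assert(LIBETH_XDP_HEADROOM == 258, "matches the 258 B above");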

That's great news. We've been seeing growing adoption of custom XDP
metadata ([1], [2]) at Cloudflare, so the current 192 B of headroom in
the ice driver has been limiting.

[1] https://docs.ebpf.io/linux/helper-function/bpf_xdp_adjust_meta/
[2] https://docs.kernel.org/networking/xdp-rx-metadata.html#af-xdp
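
In case it helps, a minimal sketch of the kind of program this enables
(the struct layout and names are made up for illustration, not our
production code):

/* Reserve a small custom metadata area in front of the packet. With
 * only 192 B of driver headroom, larger metadata plus any later
 * bpf_xdp_adjust_head() growth can run out of room; 256+ B gives us
 * breathing space.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct meta {
	__u32 mark;
	__u32 flow_hash;
};

SEC("xdp")
int xdp_tag(struct xdp_md *ctx)
{
	struct meta *m;
	void *data;

	/* Grow the metadata area; fails if the headroom is too small. */
	if (bpf_xdp_adjust_meta(ctx, -(int)sizeof(*m)))
		return XDP_PASS;

	m = (void *)(long)ctx->data_meta;
	data = (void *)(long)ctx->data;
	if ((void *)(m + 1) > data)	/* bounds check for the verifier */
		return XDP_PASS;

	m->mark = 1;
	m->flow_hash = 0;
	return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";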
