Message-ID: <24f1f083-92cd-4576-ac1f-2c53861ebc3f@intel.com>
Date: Thu, 21 Aug 2025 09:27:22 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>, Tony Nguyen
<anthony.l.nguyen@...el.com>, <ast@...nel.org>, <davem@...emloft.net>,
<kuba@...nel.org>, <pabeni@...hat.com>, <netdev@...r.kernel.org>
CC: <maciej.fijalkowski@...el.com>, <magnus.karlsson@...el.com>,
<andrew+netdev@...n.ch>, <daniel@...earbox.net>, <john.fastabend@...il.com>,
<sdf@...ichev.me>, <bpf@...r.kernel.org>, <horms@...nel.org>,
<przemyslaw.kitszel@...el.com>, <aleksander.lobakin@...el.com>,
<jaroslav.pulchart@...ddata.com>, <jdamato@...tly.com>,
<christoph.petrausch@...pl.com>, Rinitha S <sx.rinitha@...el.com>, "Priya
Singh" <priyax.singh@...el.com>, Eelco Chaudron <echaudro@...hat.com>,
<edumazet@...gle.com>
Subject: Re: [PATCH net 3/6] ice: fix Rx page leak on multi-buffer frames
On 8/19/2025 12:53 PM, Jacob Keller wrote:
>
>
> On 8/19/2025 9:44 AM, Jesper Dangaard Brouer wrote:
>>
>>
>> On 19/08/2025 02.38, Jacob Keller wrote:
>>>
>>>
>>> On 8/18/2025 4:05 AM, Jesper Dangaard Brouer wrote:
>>>> On 15/08/2025 22.41, Tony Nguyen wrote:
>>>>> This has the advantage that we also no longer need to track or cache the
>>>>> number of fragments in the rx_ring, which saves a few bytes in the ring.
>>>>>
>>>>
>>>> Has anyone tested the performance impact for XDP_DROP?
>>>> (with standard non-multi-buffer frames)
>>>>
>>>> The code change below will always touch cache lines in the shared_info
>>>> area. Before, it was guarded with an xdp_buff_has_frags() check.
>>>>
>>>
>>> I did some basic testing with XDP_DROP previously using the xdp-bench
>>> tool, and do not recall noticing an issue. I don't recall the actual
>>> numbers now though, so I did some quick tests again.
>>>
>>> without patch...
>>>
>>> Client:
>>> $ iperf3 -u -c 192.168.93.1 -t86400 -l1200 -P20 -b5G
>>> ...
>>> [SUM] 10.00-10.33 sec 626 MBytes 16.0 Gbits/sec 546909
>>>
>>> $ iperf3 -s -B 192.168.93.1%ens260f0
>>> [SUM] 0.00-10.00 sec 17.7 GBytes 15.2 Gbits/sec 0.011 ms
>>> 9712/15888183 (0.061%) receiver
>>>
>>> $ xdp-bench drop ens260f0
>>> Summary 1,778,935 rx/s 0 err/s
>>> Summary 2,041,087 rx/s 0 err/s
>>> Summary 2,005,052 rx/s 0 err/s
>>> Summary 1,918,967 rx/s 0 err/s
>>>
>>> with patch...
>>>
>>> Client:
>>> $ iperf3 -u -c 192.168.93.1 -t86400 -l1200 -P20 -b5G
>>> ...
>>> [SUM] 78.00-78.90 sec 2.01 GBytes 19.1 Gbits/sec 1801284
>>>
>>> Server:
>>> $ iperf3 -s -B 192.168.93.1%ens260f0
>>> [SUM] 77.00-78.00 sec 2.14 GBytes 18.4 Gbits/sec 0.012 ms
>>> 9373/1921186 (0.49%)
>>>
>>> xdp-bench:
>>> $ xdp-bench drop ens260f0
>>> Dropping packets on ens260f0 (ifindex 8; driver ice)
>>> Summary 1,910,918 rx/s 0 err/s
>>> Summary 1,866,562 rx/s 0 err/s
>>> Summary 1,901,233 rx/s 0 err/s
>>> Summary 1,859,854 rx/s 0 err/s
>>> Summary 1,593,493 rx/s 0 err/s
>>> Summary 1,891,426 rx/s 0 err/s
>>> Summary 1,880,673 rx/s 0 err/s
>>> Summary 1,866,043 rx/s 0 err/s
>>> Summary 1,872,845 rx/s 0 err/s
>>>
>>>
>>> I ran it a few times and it seemed to waffle a bit between 15 Gbit/sec and
>>> 20 Gbit/sec, with throughput varying regardless of whether the patch was
>>> applied. I actually tended to see slightly higher numbers with this fix
>>> applied, but it was not consistent and was hard to measure.
>>>
>>
>> The above testing is not a valid XDP_DROP test.
>>
>
> Fair. I'm no XDP expert, so I have a lot to learn here :)
>
>> The packet generator needs to be much, much faster, as XDP_DROP is for
>> DDoS protection use-cases (one of Cloudflare's main products).
>>
>> I recommend using the script for pktgen in kernel tree:
>> samples/pktgen/pktgen_sample03_burst_single_flow.sh
>>
>> Example:
>> ./pktgen_sample03_burst_single_flow.sh -vi mlx5p2 -d 198.18.100.1 -m
>> b4:96:91:ad:0b:09 -t $(nproc)
>>
>>
>>> without the patch:
>>
>> On my testlab with CPU: AMD EPYC 9684X (SRSO=IBPB) running:
>> - sudo ./xdp-bench drop ice4 # (defaults to no-touch)
>>
>> XDP_DROP (with no-touch)
>> Without patch : 54,052,300 rx/s = 18.50 nanosec/packet
>> With the patch: 33,420,619 rx/s = 29.92 nanosec/packet
>> Diff: 11.42 nanosec
>>
>
> Oof. Yeah, that's not good.
>
>> Using perf stat I can see an increase in cache-misses.
>>
>> The difference is smaller if we read packet data, running:
>> - sudo ./xdp-bench drop ice4 --packet-operation read-data
>>
>> XDP_DROP (with read-data)
>> Without patch : 27,200,683 rx/s = 36.76 nanosec/packet
>> With the patch: 24,348,751 rx/s = 41.07 nanosec/packet
>> Diff: 4.31 nanosec
>>
>> On this CPU we don't have DDIO/DCA, so we take a big hit reading the
>> packet data in XDP. This will be needed by our DDoS bpf_prog.
>> The nanosec diff isn't the same, so it seems this change can hide a
>> little behind the cache-miss in the XDP bpf_prog.
>>
>>
>>> Without xdp-bench running the XDP program, top showed a CPU usage of
>>> 740% and a ~86% idle score.
>>>
>>
>> We don't want a scaling test for this. For this XDP_DROP/DDoS test we
>> want to target a single CPU. This is most easily done by generating a
>> single flow (hint: the pktgen script is called _single_flow). We want to see a
>> single CPU running ksoftirqd 100% of the time.
>>
>
> Ok.
>
>>>
>>> I'm not certain, but reading the helpers it might be correct to do
>>> something like this:
>>>
>>> if (unlikely(xdp_buff_has_frags(xdp)))
>>>         nr_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
>>> else
>>>         nr_frags = 1;
>>
>> Yes, that looks like a correct pattern.
>>
It looks like i40e has the same mistake, but perhaps it's less impacted
because of the lower network speeds.

This mistake crept in because i40e_process_rx_buffs() (which I borrowed
this logic from) unconditionally checks the shared info for nr_frags.

In actuality, nr_frags counts the fragments beyond the initial
descriptor, but the check in the loop body is aware of that and
accounts for it.

Thus, I think what we really want here is to set nr_frags to 0 if
xdp_buff_has_frags() is false, not 1. A helper function seems like the
best solution, and I can submit a change to i40e to fix that code,
assuming I can measure the difference there as well.
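
Roughly what I have in mind, as a sketch only (the helper name and
where it would live are placeholders, not a final API):

/* Hypothetical helper sketch -- name and location are placeholders.
 * Only touch the shared_info cache line when frags are present;
 * for a single-buffer frame report 0 extra fragments.
 */
static inline u32 xdp_buff_get_nr_frags(struct xdp_buff *xdp)
{
	if (unlikely(xdp_buff_has_frags(xdp)))
		return xdp_get_shared_info_from_buff(xdp)->nr_frags;

	return 0;
}

That way the single-buffer fast path never reads shared_info at all,
and the caller sees nr_frags == 0 for the no-frags case.
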
Thanks,
Jake