Message-ID: <362d84f1-5adf-4fae-a826-01c39f891f1e@redhat.com>
Date: Thu, 9 Jan 2025 11:09:02 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Furong Xu <0x1207@...il.com>, Jason Xing <kerneljasonxing@...il.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
 Jesper Dangaard Brouer <hawk@...nel.org>,
 Ilias Apalodimas <ilias.apalodimas@...aro.org>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Simon Horman <horms@...nel.org>
Subject: Re: [PATCH net-next v3] page_pool: check for dma_sync_size earlier

On 1/6/25 4:31 AM, Furong Xu wrote:
> On Mon, 6 Jan 2025 11:15:45 +0800, Jason Xing <kerneljasonxing@...il.com> wrote:
> 
>> On Mon, Jan 6, 2025 at 11:02 AM Furong Xu <0x1207@...il.com> wrote:
>>>
>>> Setting dma_sync_size to 0 is not illegal, fec_main.c and ravb_main.c
>>> already did.
>>> We can save a couple of function calls if check for dma_sync_size earlier.
>>>
>>> This is a micro optimization, about 0.6% PPS performance improvement
>>> has been observed on a single Cortex-A53 CPU core with 64 bytes UDP RX
>>> traffic test.
>>>
>>> Before this patch:
>>> The average of packets per second is 234026 in one minute.
>>>
>>> After this patch:
>>> The average of packets per second is 235537 in one minute.  
>>
>> Sorry, I keep skeptical that this small improvement can be statically
>> observed? What exact tool or benchmark are you using, I wonder?
> 
> A x86 PC send out UDP packet and the sar cmd from Sysstat package to report
> the PPS on RX side:
> sar -n DEV 60 1

I agree with Jason: in my experience a delta this small on UDP pps tests
is well below the noise level.
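
To make the noise concern concrete, here is a minimal sketch in plain
userspace C. It computes the relative delta between two pps averages and
compares the absolute delta against the spread of repeated runs on the same
kernel. The helper names and the "delta must exceed max - min of the
baseline runs" rule are assumptions made for illustration, not an
established methodology from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Relative change from 'before' to 'after', in percent. */
static double relative_delta_pct(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}

/* Spread (max - min) of repeated measurements taken on the SAME kernel. */
static double run_spread(const double *samples, size_t n)
{
	double lo, hi;
	size_t i;

	if (n == 0)
		return 0.0;
	lo = hi = samples[0];
	for (i = 1; i < n; i++) {
		if (samples[i] < lo)
			lo = samples[i];
		if (samples[i] > hi)
			hi = samples[i];
	}
	return hi - lo;
}

/*
 * Crude heuristic (an assumption, not a statistical test): treat the
 * delta as meaningful only if it is larger than the run-to-run spread
 * of the baseline.
 */
static int delta_exceeds_noise(double before, double after,
			       const double *baseline_runs, size_t n)
{
	double delta = after - before;

	if (delta < 0.0)
		delta = -delta;
	return delta > run_spread(baseline_runs, n);
}
```

With the numbers quoted above, relative_delta_pct(234026, 235537) comes out
at roughly 0.65%. Whether that is distinguishable depends on how much
several one-minute runs of the unpatched kernel differ from each other.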

I suggest doing a micro-benchmark instead: measure the CPU cycles required
for the whole page_pool_dma_sync_for_device() call via get_cycles(), on
both vanilla and with your patch, assuming the arch you have handy
supports it.

The delta in such a test should be significant.
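
The measurement pattern can be sketched as below. This is a userspace
analogue only, not kernel code: clock_gettime(CLOCK_MONOTONIC) stands in
for get_cycles(), and code_under_test() is a hypothetical placeholder for
the page_pool_dma_sync_for_device() call. In the kernel the same
start/stop bracketing would be placed around the real call.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for get_cycles(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical placeholder for the code path being measured. */
static volatile uint64_t sink;

static void code_under_test(void)
{
	sink += 1;
}

/*
 * Bracket 'iters' calls with two timestamps and return the average
 * cost per call. Averaging over many calls amortizes the cost of
 * reading the clock itself.
 */
static double avg_cost_ns(void (*fn)(void), uint64_t iters)
{
	uint64_t start, end, i;

	assert(iters > 0);
	start = now_ns();
	for (i = 0; i < iters; i++)
		fn();
	end = now_ns();
	return (double)(end - start) / (double)iters;
}
```

Running avg_cost_ns() against the unpatched and the patched variant of the
same function, with the same iteration count, gives two per-call figures
that can be compared directly, without the rest of the RX path adding
noise.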

Thanks,

Paolo

