Message-ID: <81c1a391-3193-41c6-8ab7-c50c58684a22@intel.com>
Date: Wed, 20 Aug 2025 13:29:31 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Miroslav Lichvar <mlichvar@...hat.com>
CC: Kurt Kanzenbach <kurt@...utronix.de>, Tony Nguyen
	<anthony.l.nguyen@...el.com>, Przemek Kitszel <przemyslaw.kitszel@...el.com>,
	Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, "Paolo
 Abeni" <pabeni@...hat.com>, Richard Cochran <richardcochran@...il.com>,
	Vinicius Costa Gomes <vinicius.gomes@...el.com>, Sebastian Andrzej Siewior
	<bigeasy@...utronix.de>, <intel-wired-lan@...ts.osuosl.org>,
	<netdev@...r.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH iwl-next] igb: Retrieve Tx timestamp
 directly from interrupt



On 8/20/2025 12:56 AM, Miroslav Lichvar wrote:
> On Tue, Aug 19, 2025 at 04:31:49PM -0700, Jacob Keller wrote:
>> I'm having trouble interpreting what exactly this data shows, as it's
>> quite a lot of data and numbers. I guess that it is showing when it
>> switches over to software timestamps. It would be nice if ntpperf
>> showed the number of events which were software vs hardware timestamped,
>> as that's likely the culprit. igb hardware only has a single outstanding
>> Tx timestamp at a time.
> 
> The server doesn't have a way to tell the client (ntpperf) which
> timestamps are HW or SW, we can only guess from the measured offset as
> HW timestamps should be more accurate, but on the server side the
> number of SW and HW TX timestamps provided to the client can be
> monitored with the "chronyc serverstats" command. The server requests
> both SW and HW TX timestamps and uses the better one it gets from the
> kernel, if it can actually get one before it receives the next
> request from the same client (ntpperf simulates up to 16384 concurrent
> clients).
> 
> When I run ntpperf at a fixed rate of 140000 requests per second
> for 10 seconds (-r 140000 -t 10), I get the following numbers.
> 
> Without the patch:
> NTP daemon TX timestamps   : 28056
> NTP kernel TX timestamps   : 1012864
> NTP hardware TX timestamps : 387239
> 
> With the patch:
> NTP daemon TX timestamps   : 28047
> NTP kernel TX timestamps   : 707674
> NTP hardware TX timestamps : 692326
> 
> The number of HW timestamps is significantly higher with the patch, so
> that looks good.
> 
> But when I increase the rate to 200000, I get this:
> 
> Without the patch:
> NTP daemon TX timestamps   : 35835
> NTP kernel TX timestamps   : 1410956
> NTP hardware TX timestamps : 581575
> 
> With the patch:
> NTP daemon TX timestamps   : 476908
> NTP kernel TX timestamps   : 646146
> NTP hardware TX timestamps : 412095
> 

When does the NTP daemon decide to go with timestamping within the
daemon vs timestamping in the kernel? It seems odd that we don't achieve
100% kernel timestamps...
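
For reference, the serverstats counters quoted above can be reduced to a
total and a per-source share for each run. This is only a rough sketch: it
assumes the counters were reset before each run and that every response is
counted in exactly one of the three categories. The names RUNS and
summarize are made up for illustration.

```python
# Counters copied from the quoted "chronyc serverstats" output.
# Assumption: each response lands in exactly one of the three categories.
RUNS = {
    (140000, "without patch"): {"daemon": 28056, "kernel": 1012864, "hardware": 387239},
    (140000, "with patch"): {"daemon": 28047, "kernel": 707674, "hardware": 692326},
    (200000, "without patch"): {"daemon": 35835, "kernel": 1410956, "hardware": 581575},
    (200000, "with patch"): {"daemon": 476908, "kernel": 646146, "hardware": 412095},
}


def summarize(counts):
    """Return (total timestamps, share of the total per timestamp source)."""
    total = sum(counts.values())
    shares = {source: value / total for source, value in counts.items()}
    return total, shares


if __name__ == "__main__":
    for (rate, label), counts in RUNS.items():
        total, shares = summarize(counts)
        print(
            f"{rate} req/s, {label}: total={total} "
            f"daemon={shares['daemon']:.1%} "
            f"kernel={shares['kernel']:.1%} "
            f"hardware={shares['hardware']:.1%}"
        )
```

Under those assumptions, the three counters at 200000 requests per second
sum to 2028366 without the patch and 1535149 with it, which is consistent
with the server dropping requests in the patched case.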

> With the patch, the server is now dropping requests and can provide
> a smaller number of HW timestamps and also a smaller number of SW
> timestamps, i.e. less work is done overall.
> 
> Could the explanation be that a single CPU core now needs to do more
> work, while it was better distributed before?
> 

Hm. Maybe the interrupt vector always fires on the same CPU? The work
items can go into the general pool, which spreads across all CPUs, and I
guess the amount of work needed to submit the timestamp is high enough
that doing it from the interrupt ends up costing too much?

Hmm.

We could experiment with using a kthread via the ptp_aux_work setup and
tuning it so that the thread gets a suitable priority? I don't know what
the best compromise is, since it's clear the interrupt is best if the
timestamp volume isn't too high.

