netdev - Re: [PATCH net-next v15 03/15] net: homa: create shared Homa header files


Message-ID: <fd3b25a3-018b-4732-af42-289b3c7c4817@redhat.com>
Date: Fri, 29 Aug 2025 09:53:01 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: John Ousterhout <ouster@...stanford.edu>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
 Eric Dumazet <edumazet@...gle.com>, Simon Horman <horms@...nel.org>,
 Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net-next v15 03/15] net: homa: create shared Homa header
 files

On 8/29/25 5:03 AM, John Ousterhout wrote:
> On Wed, Aug 27, 2025 at 12:21 AM Paolo Abeni <pabeni@...hat.com> wrote:
> 
>> The TSC raw value depends on the current CPU.
> 
> This is incorrect. There were problems in the first multi-core Intel
> chips in the early 2000s, but they were fixed before I began using TSC
> in 2010. The TSC counter is synchronized across cores and increments
> at a constant rate independent of core frequency and power state.

Please read:

https://elixir.bootlin.com/linux/v6.17-rc3/source/arch/x86/include/asm/tsc.h#L14

> You didn't answer my question about which time source I should use,
> but after poking around a bit it looks like ktime_get_ns is the best
> option?

Yes, ktime_get_ns().

> I have measured Homa performance using ktime_get_ns, and
> this adds about .04 core to Homa's total core utilization when driving
> a 25 Gbps link at 80% utilization bidirectional. 

What is that 0.04? A percentage? Of total CPU time? Of the CPU time used
by Homa? An absolute time?

If that is a percentage of total CPU time for a single core, such a value
is inconsistent with my benchmarking, where a couple of timestamp() reads
per aggregate packet are well below the noise level.
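
For reference, the per-read cost can be estimated with a rough userspace
sketch along these lines. It uses clock_gettime(CLOCK_MONOTONIC) as a
stand-in for the in-kernel ktime_get_ns(), so the numbers are only
indicative; the helper names monotonic_ns() and ns_per_clock_read() are
hypothetical and not part of Homa or the kernel.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for a kernel clock read: monotonic time in ns. */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost, in ns, of one clock read over iters back-to-back reads. */
double ns_per_clock_read(unsigned long iters)
{
    volatile uint64_t sink = 0; /* keeps the reads from being optimized out */
    uint64_t start, end;

    assert(iters > 0);
    start = monotonic_ns();
    for (unsigned long i = 0; i < iters; i++)
        sink += monotonic_ns();
    end = monotonic_ns();
    (void)sink;
    return (double)(end - start) / (double)iters;
}
```

Calling ns_per_clock_read() with a large iteration count (say, a few
million) and multiplying by the number of reads per aggregate packet gives
an estimate of the timestamp cost per packet on the machine under test.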

> I expect the overhead
> to scale with network bandwidth, 

Actually, it might not, if the protocol does proper aggregation.
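
As a rough illustration of the aggregation point, a simple model is
sketched below. It assumes the clock is read a fixed number of times per
aggregate packet, so the cost follows the aggregate rate rather than the
raw bandwidth. The function name and every parameter are hypothetical;
none of this is Homa code or measured data.

```c
#include <assert.h>

/*
 * Hypothetical model: fraction of one core spent reading the clock,
 * assuming a fixed number of reads per aggregate packet.
 */
double clock_overhead_cores(double link_gbps, double utilization,
                            double bytes_per_aggregate,
                            double reads_per_aggregate,
                            double ns_per_read)
{
    double bytes_per_sec, aggregates_per_sec;

    assert(bytes_per_aggregate > 0.0);
    /* Payload rate actually carried on the link, in bytes per second. */
    bytes_per_sec = link_gbps * 1e9 / 8.0 * utilization;
    /* Clock reads happen per aggregate, not per byte. */
    aggregates_per_sec = bytes_per_sec / bytes_per_aggregate;
    /* Seconds of clock-read work per second of wall time. */
    return aggregates_per_sec * reads_per_aggregate * ns_per_read * 1e-9;
}
```

In this model, going from 25 Gbps to 100 Gbps with an unchanged aggregate
size quadruples the estimate, while letting the aggregate size grow by the
same factor of four leaves it unchanged.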

> so I would expect the overhead to be
> 0.16 core at 100 Gbps. I consider this overhead to be significant, but
> I have modified homa_clock to use ktime_get_ns in the upstreamed
> version.

My not-so-wild guess is that other bottlenecks will hit much harder, and
much earlier.

/P