Message-ID: <CAGXJAmzpibzh+4FvM4mcvkXeT8f0AhMK00eqie7J8NEU9Z9xWg@mail.gmail.com>
Date: Thu, 28 Aug 2025 20:03:25 -0700
From: John Ousterhout <ouster@...stanford.edu>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, edumazet@...gle.com, horms@...nel.org, 
	kuba@...nel.org
Subject: Re: [PATCH net-next v15 03/15] net: homa: create shared Homa header files

On Wed, Aug 27, 2025 at 12:21 AM Paolo Abeni <pabeni@...hat.com> wrote:

> The TSC raw value depends on the current CPU.

This is incorrect. There were problems in the first multi-core Intel
chips in the early 2000s, but they were fixed before I began using the
TSC in 2010. The TSC is synchronized across cores and increments at a
constant rate, independent of core frequency and power state.

You didn't answer my question about which time source I should use,
but after poking around a bit it looks like ktime_get_ns() is the best
option; please let me know if there are any problems with using it.
Interestingly, ktime_get_ns() actually uses the TSC (via RDTSCP) on
Intel platforms. ktime_get_ns() takes about 14 ns per call, vs. about
8 ns for get_cycles(). I have measured Homa's performance using
ktime_get_ns(), and it adds about 0.04 core to Homa's total core
utilization when driving a 25 Gbps link bidirectionally at 80%
utilization. I expect the overhead to scale with network bandwidth, so
it would be about 0.16 core at 100 Gbps. I consider this overhead
significant, but I have modified homa_clock() to use ktime_get_ns() in
the upstreamed version.
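
Roughly, the change amounts to something like the following (a sketch
of the encapsulated wrapper; the exact upstreamed definition may
differ):

/* Sketch of the encapsulated clock helper after the change:
 * homa_clock() now reads the monotonic clock in nanoseconds
 * instead of the raw TSC. (Exact upstream code may differ.) */
#include <linux/ktime.h>
#include <linux/types.h>

static inline u64 homa_clock(void)
{
	/* On x86 with a reliable TSC, the monotonic clocksource
	 * fast path boils down to RDTSCP plus scaling, hence the
	 * ~14 ns per call measured above. */
	return ktime_get_ns();
}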

> > If not, I would prefer to retain
> > the use of TSC until someone can identify a real problem. Note that
> > the choice of clock is now well encapsulated, so if a change should
> > become necessary it will be very easy to make.
>
> AFAICS, in the current revision there are several points that could
> cause much greater latency - i.e. the long loops under BH lock with no
> reschedule. I'm surprised they don't show as ms-latency bottle-necks
> under stress test.

If you see "long loops under BH lock with no reschedule", please let
me know and I will try to fix them. My goal is to avoid such things,
but I may have missed something.
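
For illustration, the usual fix for such a spot is a bounded-batch
pattern along these lines (a sketch only; struct work_state and
process_one() are hypothetical stand-ins, not Homa code):

/* Illustrative pattern only: bound the work done while BHs are
 * disabled and periodically drop the lock so softirqs and other
 * CPUs can make progress. */
#include <linux/spinlock.h>
#include <linux/list.h>

#define BATCH_LIMIT 64		/* illustrative bound */

struct work_state {
	spinlock_t lock;
	struct list_head queue;
};

static void process_one(struct work_state *state)
{
	/* Hypothetical: remove and handle one queued item. */
	list_del(state->queue.next);
}

static void drain_queue(struct work_state *state)
{
	int done = 0;

	spin_lock_bh(&state->lock);
	while (!list_empty(&state->queue)) {
		process_one(state);
		if (++done >= BATCH_LIMIT) {
			/* Brief preemption point for pending work. */
			spin_unlock_bh(&state->lock);
			done = 0;
			spin_lock_bh(&state->lock);
		}
	}
	spin_unlock_bh(&state->lock);
}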

-John-
