Message-ID: <CAGXJAmzD7AR7vf54CVGHnRqwJnJFpO_aEQ5s8w-OdXjw0c8FKg@mail.gmail.com>
Date: Wed, 5 Feb 2025 15:56:36 -0800
From: John Ousterhout <ouster@...stanford.edu>
To: Andrew Lunn <andrew@...n.ch>
Cc: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org, edumazet@...gle.com, 
	horms@...nel.org, kuba@...nel.org
Subject: Re: [PATCH net-next v6 08/12] net: homa: create homa_incoming.c

On Mon, Feb 3, 2025 at 9:58 AM Andrew Lunn <andrew@...n.ch> wrote:
>
> > > > If that happens then it could grab the lock instead of the desired
> > > > application, which would defeat the performance optimization and delay the
> > > > application a bit. This would be no worse than if the APP_NEEDS_LOCK
> > > > mechanism were not present.
> > >
> > > Then I suggest using plain unlock/lock() with no additional spinning in
> > > between.
> >
> > My concern here is that the unlock/lock sequence will happen so fast
> > that the other thread never actually has a chance to get the lock. I
> > will do some measurements to see what actually happens; if lock
> > ownership is successfully transferred in the common case without a
> > spin, then I'll remove it.
>
> https://docs.kernel.org/locking/mutex-design.html
>
> If there is a thread waiting for the lock, it will spin for a while
> trying to acquire it. The document also mentions that when there are
> multiple waiters, the algorithm tries to be fair. So if there is a
> fast unlock/lock, it should act fairly with the other waiter.

The link above refers to mutexes, whereas the code in question uses spinlocks.

I spent some time today doing measurements, and here's what I found.

* Without the call to homa_spin the handoff fails 20-25% of the time
(i.e., the releasing thread reacquires the lock before the "needy"
thread can get it).

* With the call to homa_spin, the handoff fails 0.3-1% of the time.
This happens because of delays in the needy thread, typically an
interrupt that keeps it from retrying the lock quickly. This surprised
me, as I thought that interrupts were disabled by spinlocks, but I
definitely see the interrupts happening; maybe only *some* interrupts
(softirqs?) are disabled by spinlocks?

* I tried varying the length of the spin to see how that affects the
handoff failure rate. In case you're curious:

Spin length       Handoff failure rate
200ns             0.3-1.0%
100ns             0.4-1.0%
50ns              0.4-1.6%
20ns              1.3-3.9%
10ns              3.3-6.4%

* Note: the call to homa_spin is "free" when the lock is successfully
handed off. The thread that calls homa_spin will attempt to reacquire
the spinlock afterward, and the lock won't become free again until
well after homa_spin has returned; without the call to homa_spin, the
thread would just spend that time spinning for the lock instead. The
call only adds overhead in the (rare) case of a handoff failure.

* Interestingly, the lock transfer seems to happen a bit faster with
the homa_spin call than without it. I measured transfer times (time
from when one thread releases the lock until the other thread acquires
it) of 205-225 ns with the call to homa_spin, and 220-250 ns without
the call to homa_spin. This improvement in the common case where the
transfer succeeds more than compensates for the 100ns of wasted time
when the transfer fails.

Based on all of this, I'm going to keep the call to homa_spin but
reduce the spin time to 100ns (I want to leave some leeway in case
there is variation between architectures in how long it takes the
needy thread to grab the lock). I have fleshed out the comment next to
the code to provide more information about the benefits and to make it
clear that the benefits have been measured, not just hypothesized.

-John-
