linux-kernel - Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to be released

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wijfQyv1BMHj5CFqj6C4LKaEzq4b==+Y-V3HPgDO7HR3A@mail.gmail.com>
Date: Thu, 13 Jun 2024 09:50:12 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: brauner@...nel.org, viro@...iv.linux.org.uk, jack@...e.cz, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to
 be released

On Wed, 12 Jun 2024 at 23:10, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> On Wed, Jun 12, 2024 at 06:23:18PM -0700, Linus Torvalds wrote:
> > So I have no problem with your patch 2/2 - moving the lockref data
> > structure away from everything else that can be shared read-only makes
> > a ton of sense independently of anything else.
> >
> > Except you also randomly increased a retry count in there, which makes no sense.
>
> Cmon man, that's a change which unintentionally crept into the patch and
> I failed to notice.

Heh. It wasn't entirely obvious that it was unintentional, since the
series very much was about that whole rety thing.

But good.

> I was playing with a bpftrace probe reporting how many spins were
> performed waiting for the lock. For my intentional usage with ls it was
> always < 30 or so. The random-ass intruder which messes with my bench on
> occasion needed over 100.

Ok. So keeping it at 100 is likely fine.

Of course, one option is to simply get rid of the retry count
entirely, and just make it all be entirely unconditional.

The fallback to locking is not technically required at all for the
USE_CMPXCHG_LOCKREF case.

That has some unfairness issues, though. But my point is mostly that
the count is probably not hugely important or meaningful.

> I tested your code and confirm it fixes the problem, nothing blows up
> either and I fwiw I don't see any bugs in it.

I was more worried about some fat-fingering than any huge conceptual
bugs, and any minor testing with performance checks would find that.

Just as an example: my first attempt switched the while(likely(..)) to
the if (unlikely(..)) in the loop, but didn't add the "!" to negate
the test.

I caught it immediately and obviously never sent that broken thing out
(and it was one reason why I decided I needed to make the code more
readable with that lockref_locked() helper macro). But that's mainly
the kind of thing I was worried about.

> When writing the commit message feel free to use mine in whatever capacity
> (including none) you want.

Ack.
> I think lockref claiming to be a general locking facility means it
> should not be suffering the degradation problem to begin with, so that
> would be the real problem as far as lockref goes.

Well, it was certainly originally meant to be generic, but after more
than a decade, the number of users is three. And the two non-dcache
users are pretty darn random.

So it's effectively really just a dcache special case with two
filesystems that started using it just because the authors had clearly
seen the vfs use of it...

> All that aside, you did not indicate how do you want to move forward
> regarding patch submission.

lockref is fairly unusual, and is pretty much mine. The initial
concept was from Waiman Long:

   https://lore.kernel.org/all/1372268603-46748-1-git-send-email-Waiman.Long@hp.com/

but I ended up redoing it pretty much from scratch, and credited him
as author for the initial commit.

It's _technically_ a locking thing, but realistically it has never
been locking tree and it's effectively a dcache thing.

           Linus