linux-kernel - Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to be released

Message-ID: <CAHk-=wgX9UZXWkrhnjcctM8UpDGQqWyt3r=KZunKV3+00cbF9A@mail.gmail.com>
Date: Wed, 12 Jun 2024 18:23:18 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: brauner@...nel.org, viro@...iv.linux.org.uk, jack@...e.cz, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to
 be released

On Wed, 12 Jun 2024 at 17:12, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> While I did not try to figure out who transiently took the lock (it was
> something outside of the benchmark), I devised a trivial reproducer
> which triggers the problem almost every time: merely issue "ls" of the
> directory containing the tested file (in this case: "ls /tmp").

So I have no problem with your patch 2/2 - moving the lockref data
structure away from everything else that can be shared read-only makes
a ton of sense independently of anything else.

Except you also randomly increased a retry count in there, which makes no sense.

But this patch 1/2 makes me go "Eww, hacky hacky".

We already *have* the retry loop, it's just that currently it only
covers the cmpxchg failures.

The natural thing to do is to just make the "wait for unlocked" be
part of the same loop.

In fact, I have this memory of trying this originally, and it not
mattering and just making the code uglier, but that may be me
confusing myself. It's a *loong* time ago.

With the attached patch, lockref_get() (to pick one random case) ends
up looking like this:

        mov    (%rdi),%rax
        mov    $0x64,%ecx
  loop:
        test   %eax,%eax
        jne    locked
        mov    %rax,%rdx
        sar    $0x20,%rdx
        add    $0x1,%edx
        shl    $0x20,%rdx
        lock cmpxchg %rdx,(%rdi)
        jne    fail
        // SUCCESS
        ret
  locked:
        pause
        mov    (%rdi),%rax
  fail:
        sub    $0x1,%ecx
        jne    loop

(with the rest being the "take lock and go slow" case).

It seems much better to me to have *one* retry loop that handles both
the causes of failures.
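
For illustration only, here is a minimal user-space sketch of that
single-loop shape. It is not the attached patch: the model_* names,
the C11 atomics, and the packing (lock word in the low 32 bits, count
in the high 32 bits, as the assembly above suggests) are all
assumptions made for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not the kernel's struct lockref:
 * low 32 bits = lock word (0 means unlocked), high 32 bits = count. */
typedef struct {
    _Atomic uint64_t lock_count;
} model_lockref;

#define MODEL_RETRIES   100                     /* the 0x64 in the listing */
#define model_locked(v) ((uint32_t)(v) != 0)    /* readability helper */

/* Stand-in for the kernel's cpu_relax(); a real build would emit "pause". */
static inline void model_cpu_relax(void) { }

/* Returns true if the count was bumped locklessly, false if the caller
 * should fall back to taking the lock (the slow path, not modelled here). */
static bool model_lockref_get_fast(model_lockref *ref)
{
    uint64_t old = atomic_load_explicit(&ref->lock_count,
                                        memory_order_relaxed);

    for (int retry = MODEL_RETRIES; retry > 0; retry--) {
        if (model_locked(old)) {
            /* Failure cause 1: lock held. Wait a bit and re-read. */
            model_cpu_relax();
            old = atomic_load_explicit(&ref->lock_count,
                                       memory_order_relaxed);
            continue;
        }

        /* Lock word is zero, so adding 1 << 32 only bumps the count. */
        uint64_t next = old + (1ULL << 32);
        if (atomic_compare_exchange_strong_explicit(&ref->lock_count,
                                                    &old, next,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed))
            return true;

        /* Failure cause 2: cmpxchg lost a race. 'old' now holds the
         * current value, so just go around the same loop again. */
    }
    return false;
}
```

Both failure causes draw from the same retry budget, which is the
point of folding them into one loop.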

Entirely untested, I only looked at the generated code and it looked
reasonable. The patch may be entirely broken for some random reason I
didn't think of.

And in case you wonder, that 'lockref_locked()' macro I introduce is
purely to make the code more readable. Without it, that one
conditional line ends up being insanely long; the macro is there just
to break things up into slightly more manageable chunks.

Mind testing this approach instead?

                 Linus

[Attachment: "patch.diff", type "text/x-patch", 1123 bytes]