lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 28 Sep 2023 10:43:21 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        "H . Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
        linux-api@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
        Florian Weimer <fw@...eb.enyo.de>, David.Laight@...lab.com,
        carlos@...hat.com, Peter Oskolkov <posk@...k.io>,
        Alexander Mikhalitsyn <alexander@...alicyn.com>,
        Chris Kennelly <ckennelly@...gle.com>,
        Ingo Molnar <mingo@...hat.com>,
        Darren Hart <dvhart@...radead.org>,
        Davidlohr Bueso <dave@...olabs.net>,
        André Almeida <andrealmeid@...lia.com>,
        libc-alpha@...rceware.org, Jonathan Corbet <corbet@....net>,
        Noah Goldstein <goldstein.w.n@...il.com>,
        Daniel Colascione <dancol@...gle.com>, longman@...hat.com,
        Florian Weimer <fweimer@...hat.com>
Subject: Re: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq

On Thu, 28 Sep 2023 12:39:26 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> As always, are syscalls really *that* expensive? Why can't we busy wait
> in the kernel instead?

Yes syscalls are that expensive. Several years ago I had a good talk
with Robert Haas (one of the PostgreSQL maintainers) at Linux Plumbers,
and I asked him if they used futexes. His answer was "no". He told me
how they did several benchmarks and it was a huge performance hit (and
this was before Spectre/Meltdown made things much worse). He explained
to me that most locks are taken just to flip a few bits. Going into the
kernel and coming back was orders of magnitude longer than the critical
sections. By going into the kernel, it caused a ripple effect and lead
to even more contention. There answer was to implement their locking
completely in user space without any help from the kernel.

This is when I thought that having an adaptive spinner that could get
hints from the kernel via memory mapping would be extremely useful.

The obvious problem with their implementation is that if the owner is
sleeping, there's no point in spinning. Worse, the owner may even be
waiting for the spinner to get off the CPU before it can run again. But
according to Robert, the gain in the general performance greatly
outweighed the few times this happened in practice.

But still, if userspace could figure out if the owner is running on
another CPU or not, to act just like the adaptive mutexes in the
kernel, that would prevent the problem of a spinner keeping the owner
from running.

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ