Message-ID: <c50d94ff-cd22-426a-8ede-21d9d045e2a1@suse.cz>
Date: Mon, 31 Mar 2025 15:11:13 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
 Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: bpf <bpf@...r.kernel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
 <martin.lau@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
 Peter Zijlstra <peterz@...radead.org>,
 Sebastian Sewior <bigeasy@...utronix.de>,
 Steven Rostedt <rostedt@...dmis.org>, Michal Hocko <mhocko@...e.com>,
 Shakeel Butt <shakeel.butt@...ux.dev>, linux-mm <linux-mm@...ck.org>,
 LKML <linux-kernel@...r.kernel.org>, Johannes Weiner <hannes@...xchg.org>
Subject: Re: [GIT PULL] Introduce try_alloc_pages for 6.15

On 3/31/25 00:08, Linus Torvalds wrote:
> On Sun, 30 Mar 2025 at 14:30, Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
>>
>> But to avoid being finger pointed, I'll switch to checking alloc_flags
>> first. It does seem a better trade off to avoid cache bouncing because
>> of 2nd cmpxchg. Though when I wrote it this way I convinced myself and
>> others that it's faster to do trylock first to avoid branch misprediction.
> 
> Yes, the really hot paths (ie core locking) do the "trylock -> read
> spinning" for that reason. Then for the normal case, _only_ the
> trylock is in the path, and that's the best of both worlds.

I've been wondering whether spinlocks could expose the contended slowpath, so
that we can trylock, do the check on failure, and then directly call a slowpath
that doesn't include another trylock.
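
To make the idea concrete, here is a minimal userspace sketch using C11 atomics. It is not the kernel's spinlock implementation: struct sketch_lock and every sketch_* function are hypothetical names, and the may_spin flag merely stands in for the caller's own check (the alloc_flags test discussed above). The slowpath read-spins first, so a failed trylock is not immediately followed by a second compare-and-exchange.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock; NOT the kernel's spinlock_t. */
struct sketch_lock {
	atomic_int locked;	/* 0 = free, 1 = held */
};

/* Fast path: one compare-and-exchange, small enough to be inlined. */
static inline bool sketch_trylock(struct sketch_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static inline void sketch_unlock(struct sketch_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/*
 * Contended slow path, exposed as its own out-of-line function.
 * It starts by read-spinning and attempts a compare-and-exchange only
 * once the lock looks free, so it does not open with another trylock.
 */
void sketch_lock_slowpath(struct sketch_lock *l)
{
	for (;;) {
		while (atomic_load_explicit(&l->locked, memory_order_relaxed))
			;	/* read-only spin, no write to the cacheline */
		if (sketch_trylock(l))
			return;
	}
}

/*
 * Hypothetical caller: trylock first; on failure run the caller's own
 * check (may_spin stands in for it), then either give up or go straight
 * to the slow path.
 */
static inline bool sketch_lock_or_bail(struct sketch_lock *l, bool may_spin)
{
	if (sketch_trylock(l))
		return true;
	if (!may_spin)
		return false;
	sketch_lock_slowpath(l);
	return true;
}
```

In the uncontended case only the inlined compare-and-exchange runs; the caller's check and the function call are paid only after a failed trylock.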

It would also be nice if the trylock part could become inline, with only the
slowpath being a function call, even during normal spin_lock_*() operation.
AFAIK right now everything is a function call on x86_64. I'm not sure how
feasible that would be with the alternatives and paravirt stuff we do.

> And in practice, the "do two compare-and-exchange" operations actually
> does work fine, because the cacheline will generally be sticky enough
> that you don't actually get much extra cacheline bouncing.
> 
> So I'm not sure it matters in the end, but I did react to it.
> 
>              Linus

