Date: Thu, 13 Jun 2024 20:41:23 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>, viro@...iv.linux.org.uk, jack@...e.cz, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to
 be released

On Thu, Jun 13, 2024 at 8:13 PM Mateusz Guzik <mjguzik@...il.com> wrote:
>
> On Thu, Jun 13, 2024 at 7:00 PM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > On Thu, 13 Jun 2024 at 06:46, Christian Brauner <brauner@...nel.org> wrote:
> > >
> > > I've picked Linus' patch and yours for testing into the vfs.inode.rcu branch.
> >
> > Btw, if you added [patch 2/2] too, I think the exact byte counts in
> > the comments are a bit misleading.
> >
> > The actual cacheline details will very much depend on 32-bit vs 64-bit
> > builds, but also on things like the slab allocator debug settings.
> >
>
> I indeed assumed "x86_64 production", with the cacheline boundaries just
> copied from pahole.
>
> However, to the best of my understanding the counts are what one
> should expect on a 64-bit kernel.
>
> That said:
>
> > I think the important part is to keep the d_lockref - that is often
> > dirty and exclusive in the cache - away from the mostly RO parts of
> > the dentry that can be shared across CPU's in the cache.
> >
> > So rather than talk about exact byte offsets, maybe just state that
> > overall rule?
> >
>
> I would assume the rule is pretty well known, and that an indication
> of what lives where (as added in my comments) would be welcome instead.
>
> But if going that route, then perhaps:
> "Make sure d_lockref does not share a cacheline with fields used by
> RCU lookup, otherwise it can result in a signification throughput
> reduction. You can use pahole(1) to check the layout."
> [maybe a link to perfbook or something goes here?]
>
> Arguably a bunch of BUILD_BUG_ONs could be added to detect the overlap
> (only active when no debug options mess with the layout).
>

So I tried to add the BUILD_BUG_ONs, but I immediately got compilation
errors over the syntax; maybe I'm brainfarting here. I'm not pasting my
broken attempt in case you want to take a stab at it from scratch
yourself, but a rough sketch of the shape I had in mind is below.
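
A minimal, untested sketch of what I mean, assuming it would sit in
include/linux/dcache.h just below the struct dentry definition, with
the current field names, the default (non-debug) layout, and
L1_CACHE_BYTES and offsetofend() available from <linux/cache.h> and
<linux/stddef.h>:

/*
 * d_lockref must not land on the cacheline that the RCU lookup
 * fields end on (d_iname is the last of them in the current layout).
 */
static_assert(offsetof(struct dentry, d_lockref) / L1_CACHE_BYTES !=
	      (offsetofend(struct dentry, d_iname) - 1) / L1_CACHE_BYTES,
	      "d_lockref shares a cacheline with RCU lookup fields");

Note this uses static_assert() rather than BUILD_BUG_ON(), since the
latter only works inside a function -- which may well be what I was
tripping over.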

I did however type up the following:
/*
 * Note: dentries have a read-mostly area heavily used by RCU (denoted with
 * "RCU lookup touched fields") which must not share cachelines with a
 * frequently written-to field like d_lockref.
 *
 * If you are modifying the layout you can check with pahole(1) that there is
 * no overlap.
 */

It could land just above the struct.
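
For reference, checking boils down to dumping the struct with pahole
from a vmlinux built with debug info, e.g.:

    pahole -C dentry vmlinux

and eyeballing the cacheline boundary markers it prints against the
offset of d_lockref.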

That's my $0.03. I am not going to give further commentary on the
matter; you guys touch it up however you see fit.

-- 
Mateusz Guzik <mjguzik gmail.com>
