lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGudoHHWL_CftUXyeZNU96qHsi5DT_OTL5ZLOWoCGiB45HvzVA@mail.gmail.com>
Date: Thu, 13 Jun 2024 20:13:10 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>, viro@...iv.linux.org.uk, jack@...e.cz, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to
 be released

On Thu, Jun 13, 2024 at 7:00 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Thu, 13 Jun 2024 at 06:46, Christian Brauner <brauner@...nel.org> wrote:
> >
> > I've picked Linus patch and your for testing into the vfs.inode.rcu branch.
>
> Btw, if you added [patch 2/2] too, I think the exact byte counts in
> the comments are a bit misleading.
>
> The actual cacheline details will very much depend on 32-bit vs 64-bit
> builds, but also on things like the slab allocator debug settings.
>

I indeed assumed "x86_64 production", with lines just copied from pahole.

However, to the best of my understanding the counts are what one
should expect on a 64-bit kernel.

That said:

> I think the important part is to keep the d_lockref - that is often
> dirty and exclusive in the cache - away from the mostly RO parts of
> the dentry that can be shared across CPU's in the cache.
>
> So rather than talk about exact byte offsets, maybe just state that
> overall rule?
>

I would assume the rule is pretty much well known and instead an
indicator where is what (as added in my comments) would be welcome.

But if going that route, then perhaps:
"Make sure d_lockref does not share a cacheline with fields used by
RCU lookup, otherwise it can result in a signification throughput
reduction. You can use pahole(1) to check the layout."
[maybe a link to perfbook or something goes here?]

Arguably a bunch of BUILD_BUG_ONs could be added to detect the overlap
(only active without whatever runtime debug messing with the layout).

-- 
Mateusz Guzik <mjguzik gmail.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ