[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGudoHGRMnPo7bAGGz6VcvzMMROjyWapO3nYdb+fW-iDEYDbgQ@mail.gmail.com>
Date: Thu, 13 Jun 2024 20:41:23 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>, viro@...iv.linux.org.uk, jack@...e.cz,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to
be released
On Thu, Jun 13, 2024 at 8:13 PM Mateusz Guzik <mjguzik@...il.com> wrote:
>
> On Thu, Jun 13, 2024 at 7:00 PM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > On Thu, 13 Jun 2024 at 06:46, Christian Brauner <brauner@...nel.org> wrote:
> > >
> > > I've picked Linus patch and your for testing into the vfs.inode.rcu branch.
> >
> > Btw, if you added [patch 2/2] too, I think the exact byte counts in
> > the comments are a bit misleading.
> >
> > The actual cacheline details will very much depend on 32-bit vs 64-bit
> > builds, but also on things like the slab allocator debug settings.
> >
>
> I indeed assumed "x86_64 production", with lines just copied from pahole.
>
> However, to the best of my understanding the counts are what one
> should expect on a 64-bit kernel.
>
> That said:
>
> > I think the important part is to keep the d_lockref - that is often
> > dirty and exclusive in the cache - away from the mostly RO parts of
> > the dentry that can be shared across CPU's in the cache.
> >
> > So rather than talk about exact byte offsets, maybe just state that
> > overall rule?
> >
>
> I would assume the rule is pretty much well known and instead an
> indicator where is what (as added in my comments) would be welcome.
>
> But if going that route, then perhaps:
> "Make sure d_lockref does not share a cacheline with fields used by
> RCU lookup, otherwise it can result in a signification throughput
> reduction. You can use pahole(1) to check the layout."
> [maybe a link to perfbook or something goes here?]
>
> Arguably a bunch of BUILD_BUG_ONs could be added to detect the overlap
> (only active without whatever runtime debug messing with the layout).
>
So I tried to add the BUILD_BUG_ONs but I got some compilation errors
immediately concerning the syntax, maybe I'm brainfarting here. I am
not pasting that in case you want to take a stab yourself from
scratch.
I did however type up the following:
/*
* Note: dentries have a read-mostly area heavily used by RCU (denoted with
* "RCU lookup touched fields") which must not share cachelines with a
* frequently written-to field like d_lockref.
*
* If you are modifying the layout you can check with pahole(1) that there is
* no overlap.
*/
could land just above the struct.
That's my $0,03. I am not going to give further commentary on the
matter, you guys touch it up however you see fit.
--
Mateusz Guzik <mjguzik gmail.com>
Powered by blists - more mailing lists