Message-ID: <6knlkefvujkry65gx6636u6e7rivqrn5kqjovs4ctjg7xtzrmo@2zd4wjx6zcym>
Date: Tue, 2 Jul 2024 23:15:44 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>,
kernel test robot <oliver.sang@...el.com>, oe-lkp@...ts.linux.dev, lkp@...el.com,
Linux Memory Management List <linux-mm@...ck.org>, linux-kernel@...r.kernel.org, ying.huang@...el.com,
feng.tang@...el.com, fengwei.yin@...el.com
Subject: Re: [linux-next:master] [lockref] d042dae6ad: unixbench.throughput -33.7% regression
On Tue, Jul 02, 2024 at 01:42:48PM -0700, Linus Torvalds wrote:
> On Tue, 2 Jul 2024 at 13:33, Mateusz Guzik <mjguzik@...il.com> wrote:
> >
> > If you are politely by lkml standards suggesting I should probably drop
> > the idea due to unforeseen complexities
>
> Oh, absolutely not. I'd love to see how nasty - or not nasty - the
> patch would end up being. I think it would be very interesting.
>
> I'm just explaining why _I_ never got around to it.
>
Yeah, I get it, but that was a passing remark on my part anyway :>
I did ask you something in the previous e-mail though (with some of the
nastiness of the problem pointed out) concerning the handling of the
slowpath vs the fastpath. Here it is again:
[..] for example, did you know xfs does not honor RCU grace periods when
recycling inodes?
https://lore.kernel.org/all/20231205113833.1187297-1-alexjlzheng@tencent.com/
So this would have to be opt-in per filesystem, probably stuffed
somewhere within the inode or dentry. I am definitely not reviewing all
the other filesystems for sanity on this front.
Rather, one could look over tmpfs, ext4, btrfs and maybe ask Kent to
sort out bcachefs (if necessary) and call it a day.
Sounds like you are dead set on the callback approach. I'm not going to
die on the inline hill, but I will spell it out so that we are on the
same page (and I have a question too).
In pseudo-code my stuff would look like this (names are for illustrative
purposes):
	struct rcunameidata {
		....
		bool in_rcu;
	};
	...
	struct rcunameidata *rnd;
	error = vfs_rcu_magic_lookup(&rnd, ....);
	if (error)
		return error;
	if (rnd->in_rcu) {
		/*
		 * fast path goes here, callback code would be identical up to
		 * the point below
		 */
		/*
		 * Now validate
		 */
		error = vfs_rcu_magic_lookup_validate_or_drop(&rnd, ....);
		if (error == 0) /* things worked out */
			return export_stuff_to_the_user(....);
		if (error < 0) /* fail */
			return error;
	}
	/*
	 * slowpath goes here
	 */
	/*
	 * all done, now whack the lookup state. the routine returns void
	 */
	vfs_rcu_magic_lookup_finish(&rnd, ....);
	if (!error)
		error = export_stuff_to_the_user(....);
	....
Can you pseudo-code what the consumer would look like in your case? Do
you want the callback to execute for both the slowpath and the fastpath
and switch on the flag? It is rather unclear what you are proposing here.
fwiw I think the above would serve as an easy-to-copy-paste idiom for
the few consumers which want it. All the complexity in their case is the
in_rcu block, which won't go away with a callback. If you still want the
callback, callback it is.
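In case it helps to poke at the control flow outside the kernel, here is
a user-space mock of the inline idiom. None of this is real VFS code: the
function names are the illustrative ones from the pseudo-code, the extra
parameters, the validate_result knob and the PATH_* values are made up
for the mock, and the return convention of the validate routine (0 =
done and state released, < 0 = failed and state released, > 0 = dropped
out of RCU mode, take the slowpath) is only my assumed reading of the
pseudo-code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical lookup state; only in_rcu comes from the pseudo-code. */
struct rcunameidata {
	bool in_rcu;
	int validate_result;	/* mock knob: what validation will return */
};

enum { PATH_FAST = 1, PATH_SLOW = 2 };

/* Mock lookup: allocates the state and reports which mode we are in. */
static int vfs_rcu_magic_lookup(struct rcunameidata **rndp, bool in_rcu,
				int validate_result)
{
	struct rcunameidata *rnd = calloc(1, sizeof(*rnd));

	if (!rnd)
		return -1;
	rnd->in_rcu = in_rcu;
	rnd->validate_result = validate_result;
	*rndp = rnd;
	return 0;
}

/*
 * Mock validation (assumed convention): on <= 0 the state is released
 * here, on > 0 we leave RCU mode and the caller continues to the slowpath.
 */
static int vfs_rcu_magic_lookup_validate_or_drop(struct rcunameidata **rndp)
{
	struct rcunameidata *rnd = *rndp;
	int ret = rnd->validate_result;

	if (ret <= 0) {
		free(rnd);
		*rndp = NULL;
		return ret;
	}
	rnd->in_rcu = false;
	return ret;
}

/* Mock teardown for the slowpath; returns void as in the pseudo-code. */
static void vfs_rcu_magic_lookup_finish(struct rcunameidata **rndp)
{
	assert(*rndp != NULL);
	free(*rndp);
	*rndp = NULL;
}

static int export_stuff_to_the_user(int value)
{
	return value;
}

/* The consumer, following the inline idiom step by step. */
int consumer(bool start_in_rcu, int validate_result)
{
	struct rcunameidata *rnd;
	int error;

	error = vfs_rcu_magic_lookup(&rnd, start_in_rcu, validate_result);
	if (error)
		return error;

	if (rnd->in_rcu) {
		/* fast path work would go here */
		error = vfs_rcu_magic_lookup_validate_or_drop(&rnd);
		if (error == 0)	/* things worked out */
			return export_stuff_to_the_user(PATH_FAST);
		if (error < 0)	/* fail */
			return error;
	}

	/* slowpath work would go here; the mock always succeeds */
	error = 0;

	vfs_rcu_magic_lookup_finish(&rnd);
	if (!error)
		error = export_stuff_to_the_user(PATH_SLOW);
	return error;
}
```

Calling consumer(true, 0) takes the fast path, consumer(true, 1) falls
back to the slowpath after a failed validation, and consumer(false, 0)
goes straight to the slowpath. The point of the mock is only that the
caller sees a single flag and a single validation call.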