[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9BD19ABC-08B8-4976-912D-DFCC06C29CAA@meta.com>
Date: Thu, 26 Jun 2025 05:52:50 +0000
From: Song Liu <songliubraving@...a.com>
To: NeilBrown <neil@...wn.name>
CC: Tingmao Wang <m@...wtm.org>,
Mickaël Salaün
<mic@...ikod.net>,
Song Liu <song@...nel.org>,
"bpf@...r.kernel.org"
<bpf@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org"
<linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>,
"linux-security-module@...r.kernel.org"
<linux-security-module@...r.kernel.org>,
"brauner@...nel.org"
<brauner@...nel.org>,
Kernel Team <kernel-team@...a.com>,
"andrii@...nel.org"
<andrii@...nel.org>,
"eddyz87@...il.com" <eddyz87@...il.com>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net"
<daniel@...earbox.net>,
"martin.lau@...ux.dev" <martin.lau@...ux.dev>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"jack@...e.cz"
<jack@...e.cz>,
"kpsingh@...nel.org" <kpsingh@...nel.org>,
"mattbobrowski@...gle.com" <mattbobrowski@...gle.com>,
Günther Noack <gnoack@...gle.com>
Subject: Re: [PATCH v5 bpf-next 0/5] bpf path iterator
> On Jun 25, 2025, at 6:05 PM, NeilBrown <neil@...wn.name> wrote:
[...]
>>
>> I can't speak for Mickaël, but a callback-based interface is less flexible
>> (and _maybe_ less performant?). Also, probably we will want to fallback
>> to a reference-taking walk if the walk fails (rather than, say, retry
>> infinitely), and this should probably use Song's proposed iterator. I'm
>> not sure if Song would be keen to rewrite this iterator patch series in
>> callback style (to be clear, it doesn't necessarily seem like a good idea
>> to me, and I'm not asking him to), which means that we will end up with
>> the reference walk API being a "call this function repeatedly", and the
>> rcu walk API taking a callback. I think it is still workable (after all,
>> if Landlock wants to reuse the code in the callback it can just call the
>> callback function itself when doing the reference walk), but it seems a
>> bit "ugly" to me.
>
> call-back can have a performance impact (less opportunity for compiler
> optimisation and CPU speculation), though less than taking spinlock and
> references. However Al and Christian have drawn a hard line against
> making seq numbers visible outside VFS code so I think it is the
> approach most likely to be accepted.
>
> Certainly vfs_walk_ancestors() would fallback to ref-walk if rcu-walk
> resulted in -ECHILD - just like all other path walking code in namei.c.
> This would be largely transparent to the caller - the caller would only
> see that the callback received a NULL path indicating a restart. It
> wouldn't need to know why.
I guess I misunderstood the proposal of vfs_walk_ancestors()
initially, so some clarification:
I think vfs_walk_ancestors() is good for the rcu-walk, and some
rcu-then-ref-walk. However, I don’t think it fits all use cases.
A reliable step-by-step ref-walk, like this set, works well with
BPF, and we want to keep it.
Can we ship this set as-is (or after fixing the comment reported
by kernel test robot)? I really don’t think we need figure out
all details about the rcu-walk here.
Once this is landed, we can try implementing the rcu-walk
(vfs_walk_ancestors or some variation). If no one volunteers, I
can give it a try.
Thanks,
Song
Powered by blists - more mailing lists