[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9F86B62C-0B79-4907-8A20-AE72E7EA2AEC@meta.com>
Date: Thu, 26 Jun 2025 14:49:22 +0000
From: Song Liu <songliubraving@...a.com>
To: Mickaël Salaün <mic@...ikod.net>
CC: NeilBrown <neil@...wn.name>, Tingmao Wang <m@...wtm.org>,
Song Liu
<song@...nel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-security-module@...r.kernel.org"
<linux-security-module@...r.kernel.org>,
"brauner@...nel.org"
<brauner@...nel.org>,
Kernel Team <kernel-team@...a.com>,
"andrii@...nel.org"
<andrii@...nel.org>,
"eddyz87@...il.com" <eddyz87@...il.com>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net"
<daniel@...earbox.net>,
"martin.lau@...ux.dev" <martin.lau@...ux.dev>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"jack@...e.cz"
<jack@...e.cz>,
"kpsingh@...nel.org" <kpsingh@...nel.org>,
"mattbobrowski@...gle.com" <mattbobrowski@...gle.com>,
Günther Noack <gnoack@...gle.com>
Subject: Re: [PATCH v5 bpf-next 0/5] bpf path iterator
> On Jun 26, 2025, at 2:43 AM, Mickaël Salaün <mic@...ikod.net> wrote:
[...]
> Given the constraints this looks good to me. Here is an updated API
> with two extra consts, an updated walk_cb() signature, and a new
> "flags" and without @root:
>
> int vfs_walk_ancestors(struct path *path,
> bool (*walk_cb)(const struct path *ancestor, void *data),
> void *data, int flags)
>
> The walk continue while walk_cb() returns true. walk_cb() can then
> check if @ancestor is equal to a @root, or other properties. The
> walk_cb() return value (if not bool) should not be returned by
> vfs_walk_ancestors() because a walk stop doesn't mean an error.
>
> @path would be updated with latest ancestor path (e.g. @root).
> @flags could contain LOOKUP_RCU or not, which enables us to have
> walk_cb() not-RCU compatible.
>
> When passing LOOKUP_RCU, if the first call to vfs_walk_ancestors()
> failed with -ECHILD, the caller can restart the walk by calling
> vfs_walk_ancestors() again but without LOOKUP_RCU.
IIUC, Neil is against using LOOKUP_RCU as input, and VFS folks
may share same the concerns.
>>
>> I guess I misunderstood the proposal of vfs_walk_ancestors()
>> initially, so some clarification:
>>
>> I think vfs_walk_ancestors() is good for the rcu-walk, and some
>> rcu-then-ref-walk. However, I don’t think it fits all use cases.
>> A reliable step-by-step ref-walk, like this set, works well with
>> BPF, and we want to keep it.
>
> The above updated API should work for both use cases: if the caller wants
> to walk only one level, walk_cb() can just always return false (and
> potentially save that it was called) and then stop the walk the first
> time it is called. This makes it possible to write an eBPF helper with
> the same API as path_walk_parent(), while making the kernel API more
> flexible.
I don’t think this will be the same. Current path_walk_parent()
holds a reference on parent path, and returns control to the caller.
However, when vfs_walk_ancestors() returns, it should not hold any
extra reference, right? IOW, vfs_walk_ancestors may hold some
reference between callbacks, but is expected to release these
references before finally returning to the caller.
> Can we ship this set as-is (or after fixing the comment reported
>> by kernel test robot)? I really don’t think we need figure out
>> all details about the rcu-walk here.
>>
>> Once this is landed, we can try implementing the rcu-walk
>> (vfs_walk_ancestors or some variation). If no one volunteers, I
>> can give it a try.
>
> My understanding is that Christian only wants one helper (that should
> handle both use cases). I think this updated API should be good enough
> for everyone. Most of your code should stay the same. What do you
> think?
Christian, could you please clarify this requirement?
Given different expectations in how the references are handled (see
above), I don’t think we can fit all use cases in one API. However,
if we find such an API in the future, which works for all cases, we
can refactor BPF side code to use that instead. Therefore, even we
have a “one API only” requirement, it is not necessary to delay this
set for a RCU-walk API.
Thanks,
Song
Powered by blists - more mailing lists