lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250626.eifo4iChohWo@digikod.net>
Date: Thu, 26 Jun 2025 11:43:49 +0200
From: Mickaël Salaün <mic@...ikod.net>
To: Song Liu <songliubraving@...a.com>
Cc: NeilBrown <neil@...wn.name>, Tingmao Wang <m@...wtm.org>, 
	Song Liu <song@...nel.org>, "bpf@...r.kernel.org" <bpf@...r.kernel.org>, 
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, 
	"linux-security-module@...r.kernel.org" <linux-security-module@...r.kernel.org>, "brauner@...nel.org" <brauner@...nel.org>, 
	Kernel Team <kernel-team@...a.com>, "andrii@...nel.org" <andrii@...nel.org>, 
	"eddyz87@...il.com" <eddyz87@...il.com>, "ast@...nel.org" <ast@...nel.org>, 
	"daniel@...earbox.net" <daniel@...earbox.net>, "martin.lau@...ux.dev" <martin.lau@...ux.dev>, 
	"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>, "jack@...e.cz" <jack@...e.cz>, 
	"kpsingh@...nel.org" <kpsingh@...nel.org>, "mattbobrowski@...gle.com" <mattbobrowski@...gle.com>, 
	Günther Noack <gnoack@...gle.com>
Subject: Re: [PATCH v5 bpf-next 0/5] bpf path iterator

On Thu, Jun 26, 2025 at 05:52:50AM +0000, Song Liu wrote:
> 
> 
> > On Jun 25, 2025, at 6:05 PM, NeilBrown <neil@...wn.name> wrote:
> 
> [...]
> 
> >> 
> >> I can't speak for Mickaël, but a callback-based interface is less flexible
> >> (and _maybe_ less performant?).  Also, probably we will want to fallback
> >> to a reference-taking walk if the walk fails (rather than, say, retry
> >> infinitely), and this should probably use Song's proposed iterator.  I'm
> >> not sure if Song would be keen to rewrite this iterator patch series in
> >> callback style (to be clear, it doesn't necessarily seem like a good idea
> >> to me, and I'm not asking him to), which means that we will end up with
> >> the reference walk API being a "call this function repeatedly", and the
> >> rcu walk API taking a callback.  I think it is still workable (after all,
> >> if Landlock wants to reuse the code in the callback it can just call the
> >> callback function itself when doing the reference walk), but it seems a
> >> bit "ugly" to me.
> > 
> > call-back can have a performance impact (less opportunity for compiler
> > optimisation and CPU speculation), though less than taking spinlock and
> > references.  However Al and Christian have drawn a hard line against
> > making seq numbers visible outside VFS code so I think it is the
> > approach most likely to be accepted.
> > 
> > Certainly vfs_walk_ancestors() would fallback to ref-walk if rcu-walk
> > resulted in -ECHILD - just like all other path walking code in namei.c.
> > This would be largely transparent to the caller - the caller would only
> > see that the callback received a NULL path indicating a restart.  It
> > wouldn't need to know why.

Given the constraints this looks good to me.  Here is an updated API
with two extra consts, an updated walk_cb() signature, and a new
"flags" and without @root:

int vfs_walk_ancestors(struct path *path,
                       bool (*walk_cb)(const struct path *ancestor, void *data),
                       void *data, int flags)

The walk continue while walk_cb() returns true.  walk_cb() can then
check if @ancestor is equal to a @root, or other properties.  The
walk_cb() return value (if not bool) should not be returned by
vfs_walk_ancestors() because a walk stop doesn't mean an error.

@path would be updated with latest ancestor path (e.g. @root).
@flags could contain LOOKUP_RCU or not, which enables us to have
walk_cb() not-RCU compatible.

When passing LOOKUP_RCU, if the first call to vfs_walk_ancestors()
failed with -ECHILD, the caller can restart the walk by calling
vfs_walk_ancestors() again but without LOOKUP_RCU.

> 
> I guess I misunderstood the proposal of vfs_walk_ancestors() 
> initially, so some clarification:
> 
> I think vfs_walk_ancestors() is good for the rcu-walk, and some 
> rcu-then-ref-walk. However, I don’t think it fits all use cases. 
> A reliable step-by-step ref-walk, like this set, works well with 
> BPF, and we want to keep it. 

The above updated API should work for both use cases: if the caller wants
to walk only one level, walk_cb() can just always return false (and
potentially save that it was called) and then stop the walk the first
time it is called.  This makes it possible to write an eBPF helper with
the same API as path_walk_parent(), while making the kernel API more
flexible.

> 
> Can we ship this set as-is (or after fixing the comment reported
> by kernel test robot)? I really don’t think we need figure out 
> all details about the rcu-walk here. 
> 
> Once this is landed, we can try implementing the rcu-walk
> (vfs_walk_ancestors or some variation). If no one volunteers, I
> can give it a try. 

My understanding is that Christian only wants one helper (that should
handle both use cases).  I think this updated API should be good enough
for everyone.  Most of your code should stay the same.  What do you
think?

> 
> Thanks,
> Song
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ