[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250612-erraten-bepacken-42675dfcfa82@brauner>
Date: Thu, 12 Jun 2025 14:31:25 +0200
From: Christian Brauner <brauner@...nel.org>
To: Jan Kara <jack@...e.cz>
Cc: Song Liu <song@...nel.org>, Tingmao Wang <m@...wtm.org>,
Mickaël Salaün <mic@...ikod.net>, NeilBrown <neil@...wn.name>, bpf@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-security-module@...r.kernel.org, kernel-team@...a.com, andrii@...nel.org, eddyz87@...il.com,
ast@...nel.org, daniel@...earbox.net, martin.lau@...ux.dev,
viro@...iv.linux.org.uk, kpsingh@...nel.org, mattbobrowski@...gle.com, amir73il@...il.com,
repnop@...gle.com, jlayton@...nel.org, josef@...icpanda.com, gnoack@...gle.com
Subject: Re: [PATCH v3 bpf-next 1/5] namei: Introduce new helper function
path_walk_parent()
On Thu, Jun 12, 2025 at 11:49:08AM +0200, Jan Kara wrote:
> On Thu 12-06-25 11:01:16, Jan Kara wrote:
> > On Wed 11-06-25 11:08:30, Song Liu wrote:
> > > On Wed, Jun 11, 2025 at 10:50 AM Tingmao Wang <m@...wtm.org> wrote:
> > > [...]
> > > > > I think we will need some callback mechanism for this. Something like:
> > > > >
> > > > > for_each_parents(starting_path, root, callback_fn, cb_data, bool try_rcu) {
> > > > > if (!try_rcu)
> > > > > goto ref_walk;
> > > > >
> > > > > __read_seqcount_begin();
> > > > > /* rcu walk parents, from starting_path until root */
> > > > > walk_rcu(starting_path, root, path) {
> > > > > callback_fn(path, cb_data);
> > > > > }
> > > > > if (!read_seqcount_retry())
> > > > > return xxx; /* successful rcu walk */
> > > > >
> > > > > ref_walk:
> > > > > /* ref walk parents, from starting_path until root */
> > > > > walk(starting_path, root, path) {
> > > > > callback_fn(path, cb_data);
> > > > > }
> > > > > return xxx;
> > > > > }
> > > > >
> > > > > Personally, I don't like this version very much, because the callback
> > > > > mechanism is not very flexible, and it is tricky to use it in BPF LSM.
> > > >
> > > > Aside from the "exposing mount seqcounts" problem, what do you think about
> > > > the parent_iterator approach I suggested earlier? I feel that it is
> > > > better than such a callback - more flexible, and also fits in right with
> > > > the BPF API you already designed (i.e. with a callback you might then have
> > > > to allow BPF to pass a callback?). There are some specifics that I can
> > > > improve - Mickaël suggested some in our discussion:
> > > >
> > > > - Letting the caller take rcu_read_lock outside rather than doing it in
> > > > path_walk_parent_start
> > > >
> > > > - Instead of always requiring a struct parent_iterator, allow passing in
> > > > NULL for the iterator to path_walk_parent to do a reference walk without
> > > > needing to call path_walk_parent_start - this way might be simpler and
> > > > path_walk_parent_start/end can just be for rcu case.
> > > >
> > > > but what do you think about the overall shape of it?
> > >
> > > Personally, I don't have strong objections to this design. But VFS
> > > folks may have other concerns with it.
> >
> > From what I've read above I'm not sure about details of the proposal but I
> > don't think mixing of RCU & non-RCU walk in a single function / iterator is
> > a good idea. IMHO the code would be quite messy. After all we have
> > follow_dotdot_rcu() and follow_dotdot() as separate functions for a reason.
> > Also given this series went through several iterations and we don't yet
> > have an acceptable / correct solution suggests getting even the standard
> > walk correct is hard enough. RCU walk is going to be only worse. So I'd
> > suggest to get the standard walk finished and agreed on first and
> > investigate feasibility of RCU variant later.
>
> OK, I've now read some of Tingmaon's and Christian's replies which I've
> missed previously so I guess I now better understand why you complicate
> things with RCU walking but still I'm of the opinion that we should start
> with getting the standard walk working. IMHO pulling in RCU walk into the
> iterator will bring it to a completely new complexity level...
I would not want it in the first place. But I have a deep seated
aversion to exposing two different variants. Especially if the second
variant wants or needs access to internal details such as mount or
dentry sequence counts. I'm not at all in favor of that.
Powered by blists - more mailing lists