Message-ID: <20250129132311.rQM6LtB2@linutronix.de>
Date: Wed, 29 Jan 2025 14:23:11 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Tejun Heo <tj@...nel.org>
Cc: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
Michal Koutný <mkoutny@...e.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Boqun Feng <boqun.feng@...il.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Hillf Danton <hdanton@...a.com>,
Johannes Weiner <hannes@...xchg.org>,
Marco Elver <elver@...gle.com>, tglx@...utronix.de
Subject: Re: [PATCH v5 5/6] kernfs: Use RCU to access kernfs_node::parent.
On 2025-01-28 10:31:47 [-1000], Tejun Heo wrote:
> Hello,
Hi,
> Mostly look great to me. Left mostly minor comments.
>
> On Tue, Jan 28, 2025 at 09:42:25AM +0100, Sebastian Andrzej Siewior wrote:
> > @@ -947,10 +947,20 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
> > return 0;
> > }
> >
> > +static void *rdt_get_kn_parent_priv(struct kernfs_node *kn)
> > +{
>
> nit: Rename to rdt_kn_parent_priv() to be consistent with other accessors?
Oh, indeed.
> > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> > index 5a1fea414996e..16d268345e3b7 100644
> > --- a/fs/kernfs/dir.c
> > +++ b/fs/kernfs/dir.c
> > @@ -64,9 +64,9 @@ static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to)
> > {
> > size_t depth = 0;
> >
> > - while (to->parent && to != from) {
> > + while (rcu_dereference(to->__parent) && to != from) {
>
> Why not use kernfs_parent() here and other places?
Because it is called from within an RCU section, so the other checks are
not required. If you prefer kernfs_parent() instead, I can certainly update it.
> > @@ -226,6 +227,7 @@ int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
> > unsigned long flags;
> > int ret;
> >
> > + guard(rcu)();
>
> Doesn't irqsave imply rcu?
Hmm. It kind of does with the current implementation, but it is not
obvious. RCU-sched and RCU used to be separate and got merged. Since then,
the (implied) preempt-off part of irqsave also implies an RCU section.
It is good to be explicit about RCU.
Also, rcu_dereference() will complain about the missing RCU annotation. On
PREEMPT_RT, rcu_dereference_sched() will complain because irqsave (in
this case) will not disable interrupts.
> > @@ -558,11 +567,7 @@ void kernfs_put(struct kernfs_node *kn)
> > return;
> > root = kernfs_root(kn);
> > repeat:
> > - /*
> > - * Moving/renaming is always done while holding reference.
> > - * kn->parent won't change beneath us.
> > - */
> > - parent = kn->parent;
> > + parent = kernfs_parent(kn);
>
> Not a strong opinion but I'd keep the comment. Reader can go read the
> definition of kernfs_parent() but no harm in explaining the subtlety where
> it's used.
Okay, I will bring it back.
> > @@ -1376,7 +1388,7 @@ static void kernfs_activate_one(struct kernfs_node *kn)
> > if (kernfs_active(kn) || (kn->flags & (KERNFS_HIDDEN | KERNFS_REMOVING)))
> > return;
> >
> > - WARN_ON_ONCE(kn->parent && RB_EMPTY_NODE(&kn->rb));
> > + WARN_ON_ONCE(kernfs_parent(kn) && RB_EMPTY_NODE(&kn->rb));
>
> Minor but this one can be rcu_access_pointer() too.
ok.
> > @@ -1794,7 +1813,7 @@ static struct kernfs_node *kernfs_dir_pos(const void *ns,
> > {
> > if (pos) {
> > int valid = kernfs_active(pos) &&
> > - pos->parent == parent && hash == pos->hash;
> > + kernfs_parent(pos) == parent && hash == pos->hash;
>
> Ditto with rcu_access_pointer(). Using kernfs_parent() here is fine too but
> it's a bit messy to mix the two for similar cases. Let's stick to either
> rcu_access_pointer() or kernfs_parent().
I will make both (kernfs_activate_one() and kernfs_dir_pos()) use
rcu_access_pointer() then.
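For illustration only, the distinction behind that choice can be modelled
in userspace. This is a rough sketch with C11 atomics, not kernel code and
not part of the patch: struct node, parent_value(), parent_deref() and
pos_still_valid() are hypothetical names, and the relaxed/acquire loads are
only an approximation of rcu_access_pointer() and rcu_dereference(). The
point is that a NULL test or a pointer comparison needs only the pointer
value, while following the pointer needs the ordered read.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kernfs_node. */
struct node {
	_Atomic(struct node *) parent;
	unsigned int hash;
};

/*
 * Value-only read, roughly in the spirit of rcu_access_pointer():
 * good for NULL tests and comparisons, the result is never dereferenced.
 */
static struct node *parent_value(struct node *n)
{
	return atomic_load_explicit(&n->parent, memory_order_relaxed);
}

/*
 * Read for dereferencing, roughly in the spirit of rcu_dereference():
 * the load is ordered before any access made through the pointer.
 */
static struct node *parent_deref(struct node *n)
{
	return atomic_load_explicit(&n->parent, memory_order_acquire);
}

/* Models the kernfs_dir_pos() check: only the pointer value is compared. */
static bool pos_still_valid(struct node *pos, struct node *parent,
			    unsigned int hash)
{
	return parent_value(pos) == parent && hash == pos->hash;
}
```

In this model parent_deref() would be the one to use in kernfs_root(),
where the parent is actually followed to reach dir.root.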
> > diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
> > index b42ee6547cdc1..c43bee18b79f7 100644
> > --- a/fs/kernfs/kernfs-internal.h
> > +++ b/fs/kernfs/kernfs-internal.h
> > @@ -64,11 +66,14 @@ struct kernfs_root {
> > *
> > * Return: the kernfs_root @kn belongs to.
> > */
> > -static inline struct kernfs_root *kernfs_root(struct kernfs_node *kn)
> > +static inline struct kernfs_root *kernfs_root(const struct kernfs_node *kn)
> > {
> > + const struct kernfs_node *knp;
> > /* if parent exists, it's always a dir; otherwise, @sd is a dir */
> > - if (kn->parent)
> > - kn = kn->parent;
> > + guard(rcu)();
> > + knp = rcu_dereference(kn->__parent);
> > + if (knp)
> > + kn = knp;
> > return kn->dir.root;
> > }
>
> This isn't a new problem but the addition of the rcu guard makes it stick
> out more: What keeps the returned root safe to dereference?
As far as I understand it, the kernfs_root stays around as long as the
filesystem itself is around, which means at least one node needs to stay.
If you have a pointer to a kernfs_node, you should own a reference.
The RCU section is only needed to ensure that the (current) __parent is
not replaced and then deallocated before the caller has had a chance to
obtain the root pointer.
> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index d9061bd55436b..214aa378936cd 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -633,9 +633,22 @@ int cgroup_task_count(const struct cgroup *cgrp)
> > return count;
> > }
> >
> > +static struct cgroup *kn_get_priv(struct kernfs_node *kn)
> > +{
> > + struct kernfs_node *parent;
> > + /*
> > + * The parent can not be replaced due to KERNFS_ROOT_INVARIANT_PARENT.
> > + * Therefore it is always safe to dereference this pointer outside of a
> > + * RCU section.
> > + */
> > + parent = rcu_dereference_check(kn->__parent,
> > + kernfs_root_flags(kn) & KERNFS_ROOT_INVARIANT_PARENT);
> > + return parent->priv;
> > +}
>
> kn_priv()?
Oh, yes.
> Thanks.
Sebastian