[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yjo9g6+PkxeRgNif@slm.duckdns.org>
Date: Tue, 22 Mar 2022 11:20:03 -1000
From: Tejun Heo <tj@...nel.org>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Imran Khan <imran.f.khan@...cle.com>, gregkh@...uxfoundation.org,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed
rwsems.
On Tue, Mar 22, 2022 at 08:26:54PM +0000, Al Viro wrote:
> The only thing that matter wrt rcu_read_lock() is that we can't block there;
> there are tons of plain spin_lock() calls done in those conditions. And
> rcu_read_lock() doesn't disable interrupts, so spin_lock_irq() is usable
> under it. Now, holding another spinlock with spin_lock_irq{,save}() *does*
> prohibit the use of spin_lock_irq() - there you can use only spin_lock()
> or spin_lock_irqsave().
Yeah, I was just listing different cases to make the point that these
functions don't know what context they might get called in.
> The callchains that prohibit spin_lock() do exist - for example, there's
> pr_cont_kernfs_path <- pr_cont_cgroup_path <- transfer_surpluses <- ioc_timer_fn.
Yeap.
> Out of curiosity, what guarantees that kernfs_remove() won't do
> fun things to ancestors of iocg_to_blkg(iocg)->blkcg->css.cgroup for some
> iocg in ioc->active_iocgs, until after ioc_rqos_exit(ioc) has finished
> del_timer_sync()?
Ah, okay, I was wrong before when I said that kernfs_remove() is like free.
It puts the base reference but as long as anybody has a ref on a
kernfs_node, the node itself and all its parents are pinned. For iocost,
ioc_pd_free() which is called from __blkg_release() ensures that the iocg is
off the active list. __blkg_release() puts its css reference before calling
ioc_pd_free(). If this were the last css of the cgroup, this can trigger the
destruction of cgroup, so the order doesn't seem right - we should call
blkg_free() first and then put the references. However, given that both css
and cgroup release paths involve a RCU grace period and __blkg_release() is
called from rcu callback, I don't think the race window actually exists.
I'll still send a patch to reorder the puts.
Thanks.
--
tejun
Powered by blists - more mailing lists