linux-kernel - Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems.

Message-ID: <YjoxDicNK1pTkrKJ@zeniv-ca.linux.org.uk>
Date:   Tue, 22 Mar 2022 20:26:54 +0000
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Tejun Heo <tj@...nel.org>
Cc:     Imran Khan <imran.f.khan@...cle.com>, gregkh@...uxfoundation.org,
        akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed
 rwsems.

On Tue, Mar 22, 2022 at 07:08:58AM -1000, Tejun Heo wrote:

> > That's interesting...  My impression had been that some of these functions
> > could be called from interrupt contexts (judging by the spin_lock_irqsave()
> > in there).  What kind of async contexts those are, and what do you use to
> > make sure they don't leak into overlap with kernfs_remove()?
> 
> The spin_lock_irqsave()'s are there because they're often used when printing
> messages which can happen from any context. e.g. cpuset ends up calling into
> them to print current's cgroup under rcu_read_lock(), iocost to print a
> warning message under an irq-safe lock. In both and similar cases, the
> caller knows that the cgroup is accessible which in turn guarantees that the
> kernfs node hasn't been deleted.

Wait a sec.  Choice of spin_lock_irqsave() vs. spin_lock_irq() is affected by
having it called with interrupts disabled; choice of either vs. spin_lock()
is not - that's needed only if you might end up taking the spinlock in question
from an interrupt handler.  "Under rcu_read_lock()" is irrelevant here...

The point of spin_lock_irq/spin_lock_irqsave is the prevention of
	spin_lock(&LOCK); // locked
take an interrupt, enter interrupt handler and there run into
	spin_lock(&LOCK); // and we spin forever
If there are no users in interrupt contexts, we are just fine with plain
spin_lock().
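
To make the scenario above concrete, here is a minimal userspace sketch of it.
It is a toy single-CPU model, not kernel code: struct toy_cpu, toy_interrupt()
and toy_critical_section() are hypothetical names made up for illustration, and
a "deadlock" is merely recorded in a flag instead of actually spinning.

```c
#include <stdbool.h>

/* Toy model of one CPU and one spinlock; none of this is kernel API. */
struct toy_cpu {
	bool irqs_enabled;	/* are interrupts enabled on this CPU? */
	bool lock_held;		/* is the (single) lock currently held? */
	bool deadlocked;	/* did a handler find the lock already held? */
};

/* Deliver an interrupt whose handler wants the same lock. */
void toy_interrupt(struct toy_cpu *cpu)
{
	if (!cpu->irqs_enabled)
		return;			/* masked: the handler does not run now */
	if (cpu->lock_held)
		cpu->deadlocked = true;	/* a real spin_lock() would spin forever */
}

/*
 * Run one critical section and report whether it deadlocked.
 * disable_irqs == false models plain spin_lock();
 * disable_irqs == true models spin_lock_irqsave().
 */
bool toy_critical_section(bool disable_irqs, bool handler_takes_lock)
{
	struct toy_cpu cpu = { .irqs_enabled = true };
	bool saved = cpu.irqs_enabled;

	if (disable_irqs)
		cpu.irqs_enabled = false;
	cpu.lock_held = true;

	/* An interrupt arrives while the lock is held. */
	if (handler_takes_lock)
		toy_interrupt(&cpu);

	cpu.lock_held = false;
	cpu.irqs_enabled = saved;
	return cpu.deadlocked;
}
```

In this model only toy_critical_section(false, true) reports a deadlock: the
plain variant is a problem exactly when some interrupt handler takes the lock.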

The only thing that matters wrt rcu_read_lock() is that we can't block there;
there are tons of plain spin_lock() calls done in those conditions.  And
rcu_read_lock() doesn't disable interrupts, so spin_lock_irq() is usable
under it.  Now, holding another spinlock with spin_lock_irq{,save}() *does*
prohibit the use of spin_lock_irq() - there you can use only spin_lock()
or spin_lock_irqsave().
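
The nesting rule can be sketched the same way.  Again a toy userspace model
with hypothetical names (struct irq_model and the model_*() helpers are not
kernel API); the one fact it relies on is that spin_unlock_irq() re-enables
interrupts unconditionally, while spin_unlock_irqrestore() puts back whatever
state was saved.

```c
#include <stdbool.h>

/* Toy model of the per-CPU interrupt-enable state only. */
struct irq_model {
	bool irqs_enabled;
};

/* Analogues of spin_lock_irq()/spin_unlock_irq(). */
void model_lock_irq(struct irq_model *m)
{
	m->irqs_enabled = false;
}

void model_unlock_irq(struct irq_model *m)
{
	m->irqs_enabled = true;		/* unconditional enable */
}

/* Analogues of spin_lock_irqsave()/spin_unlock_irqrestore(). */
bool model_lock_irqsave(struct irq_model *m)
{
	bool flags = m->irqs_enabled;

	m->irqs_enabled = false;
	return flags;
}

void model_unlock_irqrestore(struct irq_model *m, bool flags)
{
	m->irqs_enabled = flags;	/* restore the saved state */
}

/*
 * Take an outer lock with irqsave, run an inner critical section, and
 * report whether interrupts are enabled while the outer lock is still held.
 */
bool irqs_on_after_inner(bool inner_uses_irqsave)
{
	struct irq_model m = { .irqs_enabled = true };
	bool outer = model_lock_irqsave(&m);	/* interrupts are now off */
	bool leaked;

	if (inner_uses_irqsave) {
		bool inner = model_lock_irqsave(&m);

		model_unlock_irqrestore(&m, inner);
	} else {
		model_lock_irq(&m);
		model_unlock_irq(&m);
	}

	leaked = m.irqs_enabled;	/* still inside the outer section */
	model_unlock_irqrestore(&m, outer);
	return leaked;
}
```

With the inner section using the _irq pair, interrupts come back on while the
outer lock is still held; with the _irqsave pair they stay off.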

The callchains that prohibit spin_lock() do exist - for example, there's
pr_cont_kernfs_path <- pr_cont_cgroup_path <- transfer_surpluses <- ioc_timer_fn.

	Out of curiosity, what guarantees that kernfs_remove() won't do
fun things to ancestors of iocg_to_blkg(iocg)->blkcg->css.cgroup for some
iocg in ioc->active_iocgs, until after ioc_rqos_exit(ioc) has finished
del_timer_sync()?