linux-kernel - Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems.

Message-ID: <YjjP5ldCCGYqD+UV@slm.duckdns.org>
Date:   Mon, 21 Mar 2022 09:20:06 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     Imran Khan <imran.f.khan@...cle.com>, gregkh@...uxfoundation.org,
        akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed
 rwsems.

Hello,

On Mon, Mar 21, 2022 at 05:55:53PM +0000, Al Viro wrote:
> Why bother with rwsem, when we don't need anything blocking under it?
> DEFINE_RWLOCK instead of DEFINE_SPINLOCK and don't make it static.

Oh I mean, in case the common readers get way too hot, percpu_rwsem is a
relatively easy way to shift the burden from the readers to the writers. I
doubt we'll need that.

> kernfs_walk_ns() - this is fucking insane; on the surface, it needs to
> be exclusive due to the use of the same static buffer.  It uses that
> buffer to generate a pathname, *THEN* walks over it with strsep().
> That's an... interesting approach, for the lack of other printable
> terms - we walk the chain of ancestors, concatenating their names
> into a buffer and separating those names with slashes, then we walk
> that buffer, searching for slashes...  WTF?

It takes the @parent to walk string @path from. Where does it generate the
pathname?

> kernfs_rename_ns() - exclusive; that's where the tree topology gets
> changed.

This is the only true writer and it shouldn't be difficult to convert the
others to read lock w/ e.g. dynamic allocations or percpu buffers.

> So we can just turn that spinlock into rwlock, replace the existing
> uses with read_lock()/read_unlock() in kernfs_{name,path_from_node,get_parent}
> and with write_lock()/write_unlock() in the rest of fs/kernfs/dir.c,
> make it non-static, put extern into kernfs-internal.h and there you
> go...
> 
> Wait a sec; what happens if e.g. kernfs_path_from_node() races with
> __kernfs_remove()?  We do _not_ clear ->parent, but we do drop references
> that used to pin what it used to point to, unless I'm misreading that
> code...  Or is it somehow prevented by drain-related logics?  Seeing
> that it seems to be possible to have kernfs_path_from_node() called from
> an interrupt context, that could be delicate...

kernfs_remove() is akin to freeing the node and all its descendants. The
caller shouldn't be racing that against any other operations in the subtree.

Thanks.

-- 
tejun