linux-ext4 - Re: Locking issue with directory renames

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y8gejpDqxV6Uo/xY@ZenIV>
Date:   Wed, 18 Jan 2023 16:30:06 +0000
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Jan Kara <jack@...e.cz>
Cc:     linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
        Ted Tso <tytso@....edu>, linux-xfs@...r.kernel.org
Subject: Re: Locking issue with directory renames

On Wed, Jan 18, 2023 at 10:10:36AM +0100, Jan Kara wrote:

> 
> Yes, we can lock the source inode in ->rename() if we need it. The snag is
> that if 'target' exists, it is already locked so when locking 'source' we
> are possibly not following the VFS lock ordering of i_rwsem by inode
> address (I don't think it can cause any real dealock but still it looks
> suspicious). Also we'll have to lock with I_MUTEX_NONDIR2 lockdep class to
> make lockdep happy but that's just a minor annoyance. Finally, we'll have
> to check for RENAME_EXCHANGE because in that case, both source and target
> will be already locked. Thus if we do the additional locking in the
> filesystem, we will leak quite some details about rename locking into the
> filesystem which seems undesirable to me.

Rules for inode locks are simple:
	* directories before non-directories
	* ancestors before descendents
	* for non-directories the ordering is by in-core inode address

So the instances that need that extra lock would do that when source is
a directory and non RENAME_EXCHANGE is given.  Having the target already
locked is irrelevant - if it exists, it's already checked to be a directory
as well, and had it been a descendent of source, we would have already
found that and failed with -ELOOP.

If A and B are both directories, there's no ordering between them unless
one is an ancestor of another - such can be locked in any order.
However, one of the following must be true:
	* C is locked and both A and B had been observed to be children of C
after the lock on C had been acquired, or
	* ->s_vfs_rename_mutex is held for the filesystem containing both
A and B.

Note that ->s_vfs_rename_mutex is there to stabilize the tree topology and
make "is A an ancestor of B?" possible to check for more than "A is locked,
B is a child of A, so A will remain its ancestor until unlocked"...