lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191115184209.GT26530@ZenIV.linux.org.uk>
Date:   Fri, 15 Nov 2019 18:42:09 +0000
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Greg KH <gregkh@...uxfoundation.org>, yu kuai <yukuai3@...wei.com>,
        rafael@...nel.org, oleg@...hat.com, mchehab+samsung@...nel.org,
        corbet@....net, tytso@....edu, jmorris@...ei.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        zhengbin13@...wei.com, yi.zhang@...wei.com,
        chenxiang66@...ilicon.com, xiexiuqi@...wei.com
Subject: [RFC] simple_recursive_removal()

On Fri, Nov 15, 2019 at 05:54:23PM +0000, Al Viro wrote:
> Anyway, AFAICS removal could be done this way:
> 
> // parent is held exclusive, after is NULL or a child of parent
> find_next_child(parent, prev)
> 	child = NULL
> 	node = prev ? &prev->d_child : &parent->d_subdirs;
> 	grab parent->d_lock
> 	for each entry in the list starting at node->next
> 		d = container_of(entry, struct dentry, d_child)
> 		grab d->d_lock
> 		if simple_positive(d)
> 			bump d->d_count
> 			child = d
> 		drop d->d_lock
> 		if child
> 			break
> 	drop parent->d_lock
> 	dput(prev);
> 	return child
> 
> kill_it(victim, parent)
> 	if simple_positive(victim)
> 		d_invalidate(victim);	// needed to avoid lost mounts
> 		if victim is a directory
> 			fsnotify_rmdir
> 		else
> 			fsnotify_unlink
> 		if victim is regular
> 			__debugfs_file_removed(victim)
> 		dput(victim)		// unpin it
> 
> recursive_removal(dentry)
> 	this = dentry;
> 	while (true) {
> 		victim = NULL;
> 		inode = this->d_inode;
> 		inode_lock(inode);
> 		if (d_is_dir(this))
> 			mark this doomed
> 		while ((child = find_next_child(this, victim)) == NULL) {
> 			// no children (left); kill and ascend
> 			// update metadata while it's still locked
> 			inode->i_ctime = current_time(inode);
> 			clear_nlink(inode);
> 			inode_unlock(inode);
> 			victim = this;
> 			this = this->d_parent;
> 			inode = this->d_inode;
> 			inode_lock(inode);
> 			kill_it(victim, this);
> 			if (victim == dentry) {
> 				inode->i_ctime = inode->i_mtime = current_time(inode);
> 				if (d_is_dir(dentry))
> 					drop_nlink(inode);
> 				inode_unlock(inode);
> 				dput(dentry);
> 				return;
> 			}
> 		}
> 		inode_unlock(inode);
> 		this = child;
> 	}

Come to think of that, if we use IS_DEADDIR as "no more additions" marking,
that looks like a good candidate for all in-kernel rm -rf on ramfs-style
filesystems without cross-directory renames.  This bit in kill_it() above
 		if victim is regular
 			__debugfs_file_removed(victim)
would be an fs-specific callback passed by the caller, turning the whole
thing into this:

void simple_recursive_removal(struct dentry *dentry,
			      void (*callback)(struct dentry *))
{
	struct dentry *this = dentry;
	while (true) {
		struct dentry *victim = NULL, *child;
		struct inode *inode = this->d_inode;

		inode_lock(inode);
		if (d_is_dir(this))
			inode->i_flags |= S_DEAD;
		while ((child = find_next_child(this, victim)) == NULL) {
			// kill and ascend
			// update metadata while it's still locked
			inode->i_ctime = current_time(inode);
			clear_nlink(inode);
			inode_unlock(inode);
			victim = this;
			this = this->d_parent;
			inode = this->d_inode;
			inode_lock(inode);
			if (simple_positive(victim)) {
		 		d_invalidate(victim);	// avoid lost mounts
				if (is_dir(victim))
					fsnotify_rmdir(inode, victim);
				else
					fsnotify_unlink(inode, victim);
				if (callback)
					callback(victim);
				dput(victim)		// unpin it
			}
			if (victim == dentry) {
				inode->i_ctime = inode->i_mtime =
					current_time(inode);
				if (d_is_dir(dentry))
					drop_nlink(inode);
				inode_unlock(inode);
				dput(dentry);
				return;
			}
		}
		inode_unlock(inode);
		this = child;
	}
}

with find_next_child() easily implemented via scan_positives() already
in libfs.c...  Objections?  The above is obviously completely untested,
and I've got nowhere near enough sleep, so there may be any number of
brown paperbag bugs in it.  Review would be very welcome...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ