Message-ID: <20250112181201.GL6156@frogsfrogsfrogs>
Date: Sun, 12 Jan 2025 10:12:01 -0800
From: "Darrick J. Wong" <djwong@...nel.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Theodore Ts'o <tytso@....edu>, "Artem S. Tashkinov" <aros@....com>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: Spooling large metadata updates / Proposal for a new API/feature
 in the Linux Kernel (VFS/Filesystems):

On Sun, Jan 12, 2025 at 11:58:53AM +0000, Matthew Wilcox wrote:
> On Sun, Jan 12, 2025 at 12:27:43AM -0500, Theodore Ts'o wrote:
> > So yes, it basically exists, although in practice it doesn't work as
> > well as you might think, because of the need to read a potentially
> > large number of metadata blocks.  But for example, if you make sure
> > that all of the inode information is already cached, e.g.:
> > 
> >    ls -lR /path/to/large/tree > /dev/null
> > 
> > Then the operation to do a bulk update will be fast:
> > 
> >   time chown -R root:root /path/to/large/tree
> > 
> > This demonstrates that the bottleneck tends to be *reading* the
> > metadata blocks, not *writing* the metadata blocks.
> 
> So if we presented more of the operations to the kernel at once, it
> could pipeline the reading of the metadata, providing a user-visible
> win.
> 
> However, I don't know that we need a new user API to do it.  This is
> something that could be done in the "rm" tool; it has the information
> it needs, and it's better to put heuristics like "how far to read ahead"
> in userspace than in the kernel.

A couple of stock shell tools already get you the parallel metadata
reads, e.g.:

nr_cpus=$(getconf _NPROCESSORS_ONLN)
find "$path" -print0 | xargs -0 -P "$nr_cpus" chown root:root
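
On a tree small enough that every pathname fits into one command line,
xargs would hand the whole list to a single chown and the -P workers
would never start.  A sketch, assuming GNU xargs, that caps the batch
size to keep all the workers busy:

find "$path" -print0 | xargs -0 -n 1024 -P "$nr_cpus" chown root:root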

deltree is probably harder, because while you can easily parallelize
deleting the leaves, find isn't so good at telling you what the leaves
are.  I suppose you could do:

find "$path" ! -type d -print0 | xargs -0 -P "$nr_cpus" rm -f
rm -rf "$path"

which would serialize on all the directories, but hopefully there aren't
that many of those?
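
If nothing is still writing into the tree, the final pass could also
be a bottom-up rmdir sweep: find -depth emits children before their
parents, so a single serial pass over the by-then-empty directories
suffices.  A sketch under the same assumptions:

find "$path" -depth -type d -print0 | xargs -0 rmdir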

FWIW, as Amir said, xfs truncates and frees inodes in the background
now, so most of the upfront overhead of rm -rf is reading in metadata,
deleting directory entries, and putting the files on the unlinked list.
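
That split is easy to see from userspace: rm returns once the files
are on the unlinked list, and the freed space trickles back afterward
as the background inodegc workers run.  A quick check, assuming the
tree lives on an xfs mount at $mnt:

time rm -rf "$mnt/large-tree"   # returns after the unlinked-list pass
df "$mnt"                       # free space keeps climbing afterward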

--D
