linux-kernel - Re: [GIT PULL] bcachefs changes for 6.12-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <txlsptovy5x44urwd6vd2zey677m7fiqdykoalqosy3qkhbaji@p33zze2l3okt>
Date: Wed, 25 Sep 2024 01:11:46 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Dave Chinner <david@...morbit.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, 
	linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Dave Chinner <dchinner@...hat.com>
Subject: Re: [GIT PULL] bcachefs changes for 6.12-rc1

On Wed, Sep 25, 2024 at 02:43:15PM GMT, Dave Chinner wrote:
> On Tue, Sep 24, 2024 at 10:13:01PM -0400, Kent Overstreet wrote:
> > On Wed, Sep 25, 2024 at 11:00:10AM GMT, Dave Chinner wrote:
> > > > Eh? Of course it'd have to be coherent, but just checking if an inode is
> > > > present in the VFS cache is what, 1-2 cache misses? Depending on hash
> > > > table fill factor...
> > > 
> > > Sure, when there is no contention and you have CPU to spare. But the
> > > moment the lookup hits contention problems (i.e. we are exceeding
> > > the cache lookup scalability capability), we are straight back to
> > > running a VFS cache speed instead of uncached speed.
> > 
> > The cache lookups are just reads; they don't introduce scalability
> > issues, unless they're contending with other cores writing to those
> > cachelines - checking if an item is present in a hash table is trivial
> > to do locklessly.
> 
> Which was not something the VFS inode cache did until a couple of
> months ago. Just because something is possible/easy today, it
> doesn't mean it was possible or viable 15-20 years ago.

We've been doing lockless hash table lookups for a lot longer than that.

> > But pulling an inode into and then evicting it from the inode cache
> > entails a lot more work - just initializing a struct inode is
> > nontrivial, and then there's the (multiple) shared data structures you
> > have to manipulate.
> 
> Yes, but to avoid this we'd need to come up with a mechanism that is
> generally safe for most filesystems, not just bcachefs.
> 
> I mean, if you can come up with a stat() mechanism that is safe
> enough for us read straight out the XFS buffer cache for inode cache
> misses, then we'll switch over to using it ASAP.

So add a per buffer lock to your buffer cache, and check for vfs cache
residency under the lock.

Whether bulkstat is worth adding that lock, for a filesystem that
doesn't need it? That I can't say, it's a tradeoff.

> That's your challenge - if you want bcachefs to be able to do this,
> then you have to make sure the infrastructure required works for
> other filesystems just as safely, too.

So... you think bcachefs should be limited to the capabilities of what
other filesystems can do? Interesting position...