Message-ID: <jyrqedsxmvcbkqpfsgzkdq5kjkh7dbeaavbso32qih6r6d4zno@2so5bwhcd2lc>
Date: Thu, 29 Feb 2024 15:25:05 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Brian Foster <bfoster@...hat.com>
Cc: linux-bcachefs@...r.kernel.org, linux-kernel@...r.kernel.org, 
	djwong@...nel.org
Subject: Re: [PATCH 03/21] bcachefs: btree write buffer knows how to
 accumulate bch_accounting keys

On Thu, Feb 29, 2024 at 01:44:07PM -0500, Brian Foster wrote:
> On Wed, Feb 28, 2024 at 05:42:39PM -0500, Kent Overstreet wrote:
> > Shouldn't be any actual risk. It's just new accounting updates that the
> > write buffer can't flush, and those are only going to be generated by
> > interior btree node updates as journal replay has to split/rewrite nodes
> > to make room for its updates.
> > 
> > And for those new accounting updates, updates to the same counters get
> > accumulated as they're flushed from the journal to the write buffer -
> > see the patch for eytzinger tree accumulated. So we could only overflow
> > if the number of distinct counters touched somehow was very large.
> > 
> > And the number of distinct counters will be growing significantly, but
> > the new counters will all be for user data, not metadata.
> > 
> > (Except: that reminds me, we do want to add per-btree counters, so users
> > can see "I have x amount of extents, x amount of dirents, etc.").
> > 
> 
> Heh, Ok. This all does sound a little open ended to me. Maybe the better
> question is: suppose this hypothetically does happen after adding a
> bunch of new counters, what would the expected side effect be in the
> recovery scenario where the write buffer can't be flushed?

The btree write buffer buf is allowed to grow - we try to keep it
bounded in normal operation, but that's one of the ways we deal with the
unpredictability of the amount of write buffer keys in the journal.

So it'll grow until that kvrealloc fails. It won't show up as a
deadlock, it'll show up as an allocation failure; and for that to
happen, that would mean the number of accounting keys being updated - not
the number of accounting updates, just the number of distinct keys being
updated - is no longer fitting in the write buffer.
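
To illustrate the behaviour described above, here is a minimal userspace
sketch, not the bcachefs code. The names acct_key, acct_buf and
acct_buf_add are hypothetical, plain realloc() stands in for kvrealloc,
and a linear scan stands in for the eytzinger tree lookup mentioned in
the thread. It shows only the two properties under discussion: repeated
updates to one counter collapse into a single entry, and the buffer
grows with the number of distinct counters until an allocation fails.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the bcachefs structures. */
struct acct_key {
	uint64_t id;	/* which counter this entry describes */
	int64_t  delta;	/* accumulated change to that counter */
};

struct acct_buf {
	struct acct_key *keys;
	size_t nr;	/* entries in use */
	size_t size;	/* entries allocated */
};

/*
 * Add one accounting update. An update to a counter that is already
 * buffered is merged into the existing entry, so only a previously
 * unseen counter consumes a new slot.
 */
static int acct_buf_add(struct acct_buf *b, uint64_t id, int64_t delta)
{
	/* Linear scan for brevity; assumed to be a tree lookup in practice. */
	for (size_t i = 0; i < b->nr; i++)
		if (b->keys[i].id == id) {
			b->keys[i].delta += delta;
			return 0;
		}

	if (b->nr == b->size) {
		/* Grow instead of blocking; fail only if allocation fails. */
		size_t new_size = b->size ? b->size * 2 : 8;
		struct acct_key *n = realloc(b->keys, new_size * sizeof(*n));

		if (!n)
			return -ENOMEM;	/* an allocation failure, not a deadlock */
		b->keys = n;
		b->size = new_size;
	}

	b->keys[b->nr++] = (struct acct_key) { .id = id, .delta = delta };
	return 0;
}
```

In this sketch a thousand updates to the same counter occupy one slot,
so memory use tracks the number of distinct counters rather than the
number of updates, and the only failure mode is -ENOMEM from the grow
path.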