linux-kernel - Re: [PATCH] bcache: enhancing the security of dirty data writeback

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c2awlgl4ih23npqa3k2ilbrbhciv3nfd7wg5xnsjjxikcmednb@nwn3qc7aqhou>
Date: Mon, 4 Aug 2025 00:17:28 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Zhou Jifeng <zhoujifeng@...inos.com.cn>
Cc: Coly Li <colyli@...nel.org>, 
	linux-bcache <linux-bcache@...r.kernel.org>, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] bcache: enhancing the security of dirty data writeback

On Mon, Aug 04, 2025 at 11:47:57AM +0800, Zhou Jifeng wrote:
> On Sun, 3 Aug 2025 at 01:30, Coly Li <colyli@...nel.org> wrote:
> >
> > On Fri, Aug 01, 2025 at 02:10:12PM +0800, Zhou Jifeng wrote:
> > > On Fri, 1 Aug 2025 at 11:42, Kent Overstreet <kent.overstreet@...ux.dev> wrote:
> > > >
> > > > On Fri, Aug 01, 2025 at 11:30:43AM +0800, Zhou Jifeng wrote:
> > > > > On Fri, 1 Aug 2025 at 10:37, Kent Overstreet <kent.overstreet@...ux.dev> wrote:
> > > > > >
> > > > > > On Fri, Aug 01, 2025 at 10:27:21AM +0800, Zhou Jifeng wrote:
> > > > > > > In the writeback mode, the current bcache code uses the
> > > > > > > REQ_OP_WRITE operation to handle dirty data, and clears the bkey
> > > > > > > dirty flag in the btree during the bio completion callback. I think
> > > > > > > there might be a potential risk: if in the event of an unexpected
> > > > > > > power outage, the data in the HDD hardware cache may not have
> > > > > > > had time to be persisted, then the data in the HDD hardware cache
> > > > > > > that is pending processing may be lost. Since at this time the bkey
> > > > > > > dirty flag in the btree has been cleared, the data status recorded
> > > > > > > by the bkey does not match the actual situation of the SSD and
> > > > > > > HDD.
> > > > > > > Am I understanding this correctly?
> > > > > >
> > > > > > For what you're describing, we need to make sure the backing device is
> > > > > > flushed when we're flushing the journal.
> > > > > >
> > > > > > It's possible that this isn't handled correctly in bcache; bcachefs
> > > > > > does, and I wrote that code after bcache - but the bcache version would
> > > > > > look quite different.
> > > > > >
> > > > > > You've read that code more recently than I have - have you checked for
> > > > > > that?
> > > > >
> > > > > In the `write_dirty_finish` function, there is an attempt to update the
> > > > > `bkey` status, but I did not observe any logging writing process. In the
> > > > > core function `journal_write_unlocked` of bcache for writing logs, I
> > > > > also couldn't find the code logic for sending a FLUSH command to the
> > > > > backend HDD.
> > > >
> > > > The right place for it would be in the journal code: before doing a
> > > > journal write, issue flushes to the backing devices.
> > > >
> > > > Can you check for that?
> > > >
> > >
> > > I checked and found that there was no code for sending a flush request
> > > to the backend device before the execution log was written. Additionally,
> > > in the callback function after the dirty data was written back, when it
> > > updated the bkey, it did not insert this update into the log.
> > >
> >
> > It doesn't have to. If the previous dirty version of the key is on cache device
> > already, and power off happens, even the clean version of the key is gone, the
> > dirty version and its data are all valid. If the LBA range of this key is
> > allocated to a new key, a GC must have alrady happened, and the dirty key is
> > invalid due to bucket generation increased. So don't worry, the clean key is
> > unncessary to go into journal in the writeback scenario.
> >
> > IMHO, you may try to flush backing device in a kworker before calling
> > set_gc_sectors() in bch_gc_thread(). The disk cache can be flushed in time
> > before the still-dirty-on-disk keys are invalidated by increase bucket key
> > gen. And also flushing backing device after searched_full_index becomes
> > true in the writeback thread main loop (as you did now).
> >
> 
> The "flush" command previously issued by GC was supposed to alleviate
> the problems in some scenarios. However, I thought of a situation where
> this "flush" command issued before GC might not be sufficient to solve
> the issue.
> 
> Suppose such a scenario: after a power failure, some hardware cache data
> in the HDD is lost, while the corresponding bkey(with the dirty flag cleared)
> update in the SSD has been persisted. After the power is restored, if
> bcache sends a flush before GC, will this cause data loss?

Yes.