linux-kernel - Re: [PATCH v2 5/8] ext4: abort journal on data writeback failure if in data

Message-ID: <cuanusobp4oiptmmyruiimehe25zizbgdxgx5a7oudvo6repox@drpwkdfp7hpw>
Date: Tue, 21 Jan 2025 12:13:12 +0100
From: Jan Kara <jack@...e.cz>
To: libaokun@...weicloud.com
Cc: linux-ext4@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca, 
	jack@...e.cz, linux-kernel@...r.kernel.org, yi.zhang@...wei.com, 
	yangerkun@...wei.com, Baokun Li <libaokun1@...wei.com>
Subject: Re: [PATCH v2 5/8] ext4: abort journal on data writeback failure if
 in data_err=abort mode

On Tue 21-01-25 15:10:47, libaokun@...weicloud.com wrote:
> From: Baokun Li <libaokun1@...wei.com>
> 
> The data_err=abort option was initially introduced to address users' worries
> about data corruption spreading unnoticed. With direct writes, we can
> rely on return values to confirm successful writes to disk. But with
> buffered writes, a successful return only means the data has been written
> to memory. Users have no way of knowing if the data has actually been
> written to disk unless they use fsync (which impacts performance and can
> sometimes miss errors).
> 
> The current data_err=abort implementation relies on the ordered data list,
> but past changes have inadvertently altered its behavior. For example, if
> an extent is unwritten, we do not add the inode to the ordered data list.
> Therefore, jbd2 will not wait for the data write-back of that inode to
> complete and check for errors in the inode mapping. Moreover, the checks
> performed by jbd2 can also miss errors.
> 
> Now, all buffered writes eventually call ext4_end_bio(), where I/O errors
> are checked. Therefore, we can check for the data_err=abort mode at this
> point and abort the journal in a kworker (due to the interrupt context).
> 
> Therefore, when data_err=abort is enabled, the journal is aborted in
> ext4_end_io_end() when an I/O error is detected in ext4_end_bio() to make
> users who are concerned about the contents of the file happy.
> 
> Suggested-by: Jan Kara <jack@...e.cz>
> Link: https://patch.msgid.link/c7ab26f3-85ad-4b31-b132-0afb0e07bf79@huawei.com
> Signed-off-by: Baokun Li <libaokun1@...wei.com>
> Reviewed-by: Zhang Yi <yi.zhang@...wei.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

Just one naming suggestion below:

> +#define EXT4_IO_END_NEED_COMPLETION (EXT4_IO_END_UNWRITTEN | EXT4_IO_END_FAILED)

I'd call this EXT4_IO_END_DEFER_COMPLETION

> +static bool ext4_io_end_need_completion(ext4_io_end_t *io_end)

And this would then be ext4_io_end_defer_completion().
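
To make the suggested naming concrete, here is a minimal userspace sketch of
how the renamed macro and helper might read. The flag values, the simplified
ext4_io_end_t stand-in, and the helper body (which simply tests the mask) are
assumptions for illustration, not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values, for illustration only. */
#define EXT4_IO_END_UNWRITTEN	0x0001
#define EXT4_IO_END_FAILED	0x0002

/* Suggested name: completion of such io_ends is deferred to a worker. */
#define EXT4_IO_END_DEFER_COMPLETION (EXT4_IO_END_UNWRITTEN | EXT4_IO_END_FAILED)

/* Simplified stand-in for the kernel's ext4_io_end_t (hypothetical layout). */
typedef struct ext4_io_end {
	unsigned int flag;
} ext4_io_end_t;

/* Guessed body: true if any flag requiring deferred completion is set. */
static bool ext4_io_end_defer_completion(ext4_io_end_t *io_end)
{
	assert(io_end != NULL);
	return (io_end->flag & EXT4_IO_END_DEFER_COMPLETION) != 0;
}
```

The real helper may well check more than the flag mask (for example the
data_err=abort mount option); the point here is only the naming.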

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR