linux-kernel - Re: [PATCH] btrfs: add fs state details to error messages.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220215150438.GN12643@twin.jikos.cz>
Date:   Tue, 15 Feb 2022 16:04:38 +0100
From:   David Sterba <dsterba@...e.cz>
To:     Sweet Tea Dorminy <sweettea-kernel@...miny.me>
Cc:     Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] btrfs: add fs state details to error messages.

On Sat, Feb 12, 2022 at 02:10:42PM -0500, Sweet Tea Dorminy wrote:
> When a filesystem goes read-only due to an error, multiple errors tend
> to be reported, some of which are knock-on failures. Logging some
> fs_states, if any, in btrfs_handle_fs_error() and btrfs_printk()
> helps distinguish the first error from subsequent messages which may
> only exist due to an error state.
> 
> Under the new format, most initial errors will look like:
> `BTRFS: error (device loop0) in ...`
> while subsequent errors will begin with:
> `error (device loop0: state E) in ...`
> 
> An initial transaction abort error will look like
> `error (device loop0: state X) in ...`
> and subsequent messages will contain
> `(device loop0: state EX) in ...`
> 
> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@...miny.me>
> ---
>  fs/btrfs/super.c | 49 +++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 40 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 33cfc9e27451..d0e81eb48eac 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -66,6 +66,31 @@ static struct file_system_type btrfs_root_fs_type;
>  
>  static int btrfs_remount(struct super_block *sb, int *flags, char *data);
>  
> +#define STATE_STRING_PREFACE ": state "
> +#define MAX_STATE_CHARS 2
> +
> +static void btrfs_state_to_string(const struct btrfs_fs_info *info, char *buf)
> +{
> +	unsigned long state = info->fs_state;
> +	char *curr = buf;
> +
> +	memcpy(curr, STATE_STRING_PREFACE, sizeof(STATE_STRING_PREFACE));
> +	curr += sizeof(STATE_STRING_PREFACE) - 1;
> +
> +	/* If more states are reported, update MAX_STATE_CHARS also */
> +	if (test_and_clear_bit(BTRFS_FS_STATE_ERROR, &state))

The state is supposed to be sticky, so can't be cleared. Also as I read
the suggested change, the "state: " should be visible for all messages
that are printed after the filesystem state changes.

> +		*curr++ = 'E';
> +
> +	if (test_and_clear_bit(BTRFS_FS_STATE_TRANS_ABORTED, &state))
> +		*curr++ = 'X';
> +
> +	/* If no states were printed, reset the buffer */
> +	if (state == info->fs_state)
> +		curr = buf;
> +
> +	*curr++ = '\0';
> +}
> +
>  /*
>   * Generally the error codes correspond to their respective errors, but there
>   * are a few special cases.
> @@ -128,6 +153,7 @@ void __btrfs_handle_fs_error(struct btrfs_fs_info *fs_info, const char *function
>  {
>  	struct super_block *sb = fs_info->sb;
>  #ifdef CONFIG_PRINTK
> +	char statestr[sizeof(STATE_STRING_PREFACE) + MAX_STATE_CHARS];
>  	const char *errstr;
>  #endif
>  
> @@ -136,10 +162,11 @@ void __btrfs_handle_fs_error(struct btrfs_fs_info *fs_info, const char *function
>  	 * under SB_RDONLY, then it is safe here.
>  	 */
>  	if (errno == -EROFS && sb_rdonly(sb))
> -  		return;
> +		return;

Unnecessary change.

>  
>  #ifdef CONFIG_PRINTK
>  	errstr = btrfs_decode_error(errno);
> +	btrfs_state_to_string(fs_info, statestr);
>  	if (fmt) {
>  		struct va_format vaf;
>  		va_list args;
> @@ -148,12 +175,12 @@ void __btrfs_handle_fs_error(struct btrfs_fs_info *fs_info, const char *function
>  		vaf.fmt = fmt;
>  		vaf.va = &args;
>  
> -		pr_crit("BTRFS: error (device %s) in %s:%d: errno=%d %s (%pV)\n",
> -			sb->s_id, function, line, errno, errstr, &vaf);
> +		pr_crit("BTRFS: error (device %s%s) in %s:%d: errno=%d %s (%pV)\n",
> +			sb->s_id, statestr, function, line, errno, errstr, &vaf);

Alternatively the state message can be built into the message itself so
it does not require the extra array.


		pr_crit("btrfs: error(device %s%s%s%s) ...",
			sb->s_id,
			statebits ? ", state" : "",
			test_bit(FS_ERRROR) ? "E" : "",
			test_bit(TRANS_ABORT) ? "T" : "", ...);

This is the idea, final code can use some wrappers around the bit
constatnts.


>  		va_end(args);
>  	} else {
> -		pr_crit("BTRFS: error (device %s) in %s:%d: errno=%d %s\n",
> -			sb->s_id, function, line, errno, errstr);
> +		pr_crit("BTRFS: error (device %s%s) in %s:%d: errno=%d %s\n",
> +			sb->s_id, statestr, function, line, errno, errstr);

Filling the temporary array makes sense as it's used twice, however I'm
not sure if the 'else' branch is ever executed.