Message-Id: <5B6DC4D5-5EF2-49A6-96AB-690DC94135F8@dilger.ca>
Date: Thu, 24 May 2018 15:10:46 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: Wang Shilong <wshilong@....com>
Cc: Wang Shilong <wangshilong1991@...il.com>,
Ext4 Developers List <linux-ext4@...r.kernel.org>,
Shuichi Ihara <sihara@....com>, Theodore Ts'o <tytso@....edu>
Subject: Re: [PATCH 2/2] ext4: don't RO for bitmaps error in default

On May 23, 2018, at 12:38 AM, Wang Shilong <wshilong@....com> wrote:
> Hello Andreas,
>
> Sorry for late reply, I've got some time to finish this:
>
> Andreas Dilger wrote:
>> I think this looks pretty reasonable, but the replacement of ext4_error()
>> on some of the error paths with ext4_warning() seems like there is a
>> chance that an error might be "lost" if one of the underlying functions
>> misses its call to ext4_mark_group_bitmap_corrupted()->save_error_info().
>>
>> I looked at the patch and the previous one together, and it _looks_ like
>> all of the error paths are handled at some point lower down in the stack,
>> but this has the potential to break in the future if a change doesn't
>> call ext4_mark_group_bitmap_corrupted() itself for some reason.
>>
>> One option would be instead of ext4_warning() (or dropping completely)
>> the ext4_error() messages, use ext4_mark_group_bitmap_corrupted() and
>> change that function to not print an error message if the inode bitmap
>> or block bitmap is already flagged as corrupted. That gives us code
>> safety (the higher-level error check/message will catch any unhandled
>> errors from below), while also reducing console spam (duplicate errors
>> will not be printed, and the "most specific" error will print once to
>> the console).
>
> -> could you please be more specific about this? I understood that we could
> add the check in ext4_mark_group_bitmap_corrupted() that only calls
> ext4_error()/warning() if necessary, but I did not catch your other point,
> you mean we should move ext4_mark_group_bitmap_corrupted() to higher-level
> call?

No, my point was to somehow ensure that either ext4_mark_group_bitmap_corrupted()
is called (marking the group corrupt and calling ext4_warning() at least
once on that group), or ext4_error() is called.

I think the 1/2 patch is totally OK, since it is just moving the
ext4_mark_group_bitmap_corrupted() calls into ext4_group_locked_error().

I think there are a few things to do to improve this patch:
- add a check in ext4_mark_group_bitmap_corrupted() so that if the bitmap is
  already marked corrupted, nothing is printed at all
- add a mount option like "bitmaps={error,warning}" to allow the user to
  decide if corrupt bitmaps should be considered a fatal error or not, and
  check this in ext4_mark_group_bitmap_corrupted()
- rather than removing the ext4_error() calls in the higher code paths,
  add a check ext4_is_bitmap_corrupted() to check if the bitmap error flag
  for that group is set, and don't call ext4_error() if it is. That avoids
  hitting ext4_error() when the group is already flagged, but ensures that
  it will be called if the underlying code did not call
  ext4_mark_group_bitmap_corrupted() for some reason.
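
A rough userspace sketch of the three points above may make the intended
control flow clearer. This is not kernel code and not a patch: the model_*
names, struct model_sb, and enum bitmaps_mode are all hypothetical stand-ins
(only ext4_mark_group_bitmap_corrupted(), ext4_is_bitmap_corrupted(),
ext4_error(), ext4_warning() and the "bitmaps=" option are named in this
thread), and the counters exist only so the behaviour can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NGROUPS 4	/* arbitrary number of block groups for the model */

/* Hypothetical stand-in for the proposed "bitmaps={error,warning}" option. */
enum bitmaps_mode { BITMAPS_ERROR, BITMAPS_WARNING };

struct model_sb {
	enum bitmaps_mode mode;
	bool corrupt[NGROUPS];	/* per-group "bitmap corrupted" flag */
	int errors;		/* times the ext4_error()-like path ran */
	int warnings;		/* times the ext4_warning()-like path ran */
};

/* Models the proposed ext4_is_bitmap_corrupted() check. */
static bool model_is_bitmap_corrupted(const struct model_sb *sb, unsigned group)
{
	assert(group < NGROUPS);
	return sb->corrupt[group];
}

/* Models ext4_error(): counted as the "fatal" path. */
static void model_error(struct model_sb *sb, unsigned group, const char *msg)
{
	sb->errors++;
	fprintf(stderr, "error: group %u: %s\n", group, msg);
}

/*
 * Models ext4_mark_group_bitmap_corrupted() with the first two changes:
 * stay silent if the group is already flagged, otherwise flag it and
 * report once, as an error or a warning depending on the mount option.
 */
static void model_mark_group_bitmap_corrupted(struct model_sb *sb,
					      unsigned group, const char *msg)
{
	if (model_is_bitmap_corrupted(sb, group))
		return;		/* already reported, no console spam */

	sb->corrupt[group] = true;
	if (sb->mode == BITMAPS_ERROR) {
		model_error(sb, group, msg);
	} else {
		sb->warnings++;
		fprintf(stderr, "warning: group %u: %s\n", group, msg);
	}
}

/*
 * Models a higher-level code path (third change).  'low_level_marked'
 * simulates whether the underlying function remembered to mark the group.
 * The error call stays as a safety net and fires only if nothing below
 * flagged the group.
 */
static void model_high_level_check(struct model_sb *sb, unsigned group,
				   bool low_level_marked)
{
	if (low_level_marked)
		model_mark_group_bitmap_corrupted(sb, group,
						  "bitmap checksum mismatch");

	if (!model_is_bitmap_corrupted(sb, group))
		model_error(sb, group, "unhandled bitmap error");
}
```

With mode set to BITMAPS_WARNING, calling model_high_level_check() twice on
the same group with low_level_marked == true prints a single warning and no
error, while a call with low_level_marked == false still reaches
model_error(), which is the "not lost" property I was after.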
Cheers, Andreas