lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <17108cad-efa8-46b4-a320-70d7b696f75b@huawei.com>
Date: Fri, 3 Jan 2025 17:26:58 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Jan Kara <jack@...e.cz>
CC: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>, Theodore Ts'o
	<tytso@....edu>, Christian Brauner <brauner@...nel.org>,
	<sunyongjian1@...wei.com>, Yang Erkun <yangerkun@...wei.com>, Baokun Li
	<libaokun1@...wei.com>
Subject: Re: [BUG REPORT] ext4: “errors=remount-ro” has become “errors=shutdown”?

Hi Honza,

Happy New Year!

On 2025/1/2 23:58, Jan Kara wrote:
> On Mon 30-12-24 15:27:01, Baokun Li wrote:
>> After commit d3476f3dad4a (“ext4: don't set SB_RDONLY after filesystem
>> errors”) in v6.12-rc1, the “errors=remount-ro” mode no longer sets
>> SB_RDONLY on errors, which results in us seeing the filesystem is still
>> in rw state after errors. The issue fixed by this patch is reported as
>> CVE-2024-50191.
>>
>> https://lore.kernel.org/all/2024110851-CVE-2024-50191-f31c@gregkh/
>>
>> This has actually changed the remount-ro semantics. We have some fault
>> injection test cases where we unmount the filesystem after detecting
>> a ro state and then check for consistency. Our customer has a similar
>> scenario. In "errors=remount-ro" mode, some operations are performed
>> after the file system becomes read-only.
> I'm sorry to hear that your fault injection has been broken by my changes.
Never mind. Let's fix this!
>> We reported similar issues to the community in 2020,
>> https://lore.kernel.org/all/20210104160457.GG4018@quack2.suse.cz/
>> Jan Kara provides a simple and effective patch. This patch somehow
>> didn't end up merged into upstream, but this patch has been merged into
>> our internal version for a couple years now and it works fine, is it
>> possible to revert the patch that no longer sets SB_RDONLY and use
>> the patch in the link above?
> Well, the problem with filesystem freezing was just the immediate trigger
> for the changes. But the setting of SB_RDONLY bit by ext4 behind the VFS'
> back (as VFS is generally in charge of manipulating that bit and you must
> hold s_umount for that which we cannot get in ext4 when handling errors)
> was always problematic and I'm almost sure syzbot would find more problems
> with that than just fs freezing. As such I don't think we should really
> return to doing that in ext4 but we need to find other ways how to make
> your error injection to work...
I believe this is actually a bug in the evolution of the freeze
functionality. In v2.6.35-rc1 with commit 18e9e5104fcd ("Introduce
freeze_super and thaw_super for the fsfreeze ioctl"), the introduction
of freeze_super/thaw_super did not cause any problems when setting the
irreversible read-only (ro) bit without locking, because at that time
we used the flag in sb->s_frozen to determine the file system's state.
It was not until v4.3-rc1 with commit 8129ed29644b ("change sb_writers
to use percpu_rw_semaphore") introduced locking into
freeze_super/thaw_super that setting the irreversible ro without locking
caused thaw_super to fail to release the locks that should have been
released, eventually leading to hangs or other issues.

Therefore, I believe that the patch discussed in the previous link is
the correct one, and it has a smaller impact and does not introduce any
mechanism changes. Furthermore, after roughly reviewing the code using
SB_RDONLY, I did not find any logic where setting the irreversible ro
without locking could cause problems. If I have overlooked anything,
please let me know.
>> What's worse is that after commit
>>    95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag")
>> was merged in v6.6-rc1, the EXT4_FLAGS_SHUTDOWN bit is set in
>> ext4_handle_error(). This causes the file system to not be read-only
>> when an error is triggered in "errors=remount-ro" mode, because
>> EXT4_FLAGS_SHUTDOWN prevents both writing and reading.
> Here I don't understand what is really the problem with EXT4_MF_FS_ABORTED
> removal. What do you exactly mean by "causes the file system to not be
> read-only"? We still return EROFS where we used to, we disallow writing as
> we used to. Can you perhaps give an example what changed with this commit?
Sorry for the lack of clarity in my previous explanation. The key point
is not about removing EXT4_MF_FS_ABORTED, but rather we will set
EXT4_FLAGS_SHUTDOWN bit, which not only prevents writes but also prevents
reads. Therefore, saying it's not read-only actually means it's completely
unreadable.
>> I'm not sure
>> if this is the intended behavior. But if the intention is to shut down
>> the file system upon an error, I feel it would be better to add an
>> "errors=shutdown" option.
> The point was not really to create new "errors=" mode but rather implement
> the "don't modify the filesystem after we spot error" behavior without
> touching the SB_RDONLY flag.
Yes, both remount-ro and shutdown can ensure that the file system is
not modified, but the former still allows reading the files in the file
system after an error occurs, which is also the literal meaning of
'errors=remount-ro'.
> So how does your framework detect that the filesystem has failed with
> errors=remount-ro? By parsing /proc/mounts or otherwise querying current
> filesystem mount options?
In most cases, run the mount command and filter related options.
> Would it be acceptable for you to look at some
> other mount option (e.g. "shutdown") to detect that state? We could easily
> implement that.
We do need to add a shutdown hint, but that's not the point.

We've discussed this internally, and now if the logs are flushed,
we have no way of knowing if the current filesystem is shutdown. We don't
know if the -EIO from the filesystem is a hardware problem or if the
filesystem is already shutdown. So even if there is no current problem,
we should add some kind of hint to let the user know that the current
filesystem is shutdown.

The changes to display shutdown are as follows, so that we can see if the
current filesystem has been shutdown in the mount command.

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3955bec9245d..ba28ef0f662e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3157,6 +3157,9 @@ static int _ext4_show_options(struct seq_file 
*seq, struct super_block *sb,
         if (nodefs && !test_opt(sb, NO_PREFETCH_BLOCK_BITMAPS))
                 SEQ_OPTS_PUTS("prefetch_block_bitmaps");

+       if (!nodefs && ext4_forced_shutdown(sb))
+               SEQ_OPTS_PUTS("shutdown");
+
         ext4_show_quota_options(seq, sb);
         return 0;
  }
> I'm sorry again for causing you trouble.

Never mind, thank you for your reply!


Regards,
Baokun


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ