[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190313144012.GE9108@quack2.suse.cz>
Date: Wed, 13 Mar 2019 15:40:12 +0100
From: Jan Kara <jack@...e.cz>
To: Petr Mladek <pmladek@...e.com>
Cc: Jan Kara <jack@...e.cz>, Ted Tso <tytso@....edu>,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: Avoid panic during forced reboot
On Wed 13-03-19 11:25:14, Petr Mladek wrote:
> On Tue 2019-03-12 17:01:39, Jan Kara wrote:
> > On Tue 12-03-19 13:21:06, Petr Mladek wrote:
> > > On Tue 2019-03-12 10:49:06, Jan Kara wrote:
> > > > When admin calls "reboot -f" - i.e., does a hard system reboot by
> > > > directly calling reboot(2) - ext4 filesystem mounted with errors=panic
> > > > can panic the system. This happens because the underlying device gets
> > > > disabled without unmounting the filesystem and thus some syscall running
> > > > in parallel to reboot(2) can result in the filesystem getting IO errors.
> > > >
> > > > This is somewhat surprising to the users so try improve the behavior by
> > > > switching to errors=remount-ro behavior when the system is running
> > > > reboot(2).
> > > >
> > > > @@ -460,7 +466,12 @@ static void ext4_handle_error(struct super_block *sb)
> > > > if (journal)
> > > > jbd2_journal_abort(journal, -EIO);
> > > > }
> > > > - if (test_opt(sb, ERRORS_RO)) {
> > > > + /*
> > > > + * We force ERRORS_RO behavior when system is rebooting. Otherwise we
> > > > + * could panic during 'reboot -f' as the underlying device got already
> > > > + * disabled.
> > > > + */
> > > > + if (test_opt(sb, ERRORS_RO) || system_going_down()) {
> > >
> > > If I read the code correctly then this will not avoid the panic().
> >
> > No, you are right. Attached is a fixed up patch.
>
> > >From 15b830d4e877ed908c733ab3219801d1026af256 Mon Sep 17 00:00:00 2001
> > From: Jan Kara <jack@...e.cz>
> > Date: Tue, 12 Mar 2019 10:38:19 +0100
> > Subject: [PATCH] ext4: Avoid panic during forced reboot
> >
> > When admin calls "reboot -f" - i.e., does a hard system reboot by
> > directly calling reboot(2) - ext4 filesystem mounted with errors=panic
> > can panic the system. This happens because the underlying device gets
> > disabled without unmounting the filesystem and thus some syscall running
> > in parallel to reboot(2) can result in the filesystem getting IO errors.
> >
> > This is somewhat surprising to the users so try improve the behavior by
> > switching to errors=remount-ro behavior when the system is running
> > reboot(2).
> >
> > Signed-off-by: Jan Kara <jack@...e.cz>
> > ---
> > fs/ext4/super.c | 16 +++++++++++++---
> > 1 file changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 60da0a6e4d86..b7b621d5d87a 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -430,6 +430,12 @@ static void ext4_journal_commit_callback(journal_t *journal, transaction_t *txn)
> > spin_unlock(&sbi->s_md_lock);
> > }
> >
> > +static bool system_going_down(void)
> > +{
> > + return system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF
> > + || system_state == SYSTEM_RESTART;
> > +}
> > +
> > /* Deal with the reporting of failure conditions on a filesystem such as
> > * inconsistencies detected or read IO failures.
> > *
> > @@ -460,7 +466,12 @@ static void ext4_handle_error(struct super_block *sb)
> > if (journal)
> > jbd2_journal_abort(journal, -EIO);
> > }
> > - if (test_opt(sb, ERRORS_RO)) {
> > + /*
> > + * We force ERRORS_RO behavior when system is rebooting. Otherwise we
> > + * could panic during 'reboot -f' as the underlying device got already
> > + * disabled.
> > + */
> > + if (test_opt(sb, ERRORS_RO) || system_going_down()) {
> > ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only");
> > /*
> > * Make sure updated value of ->s_mount_flags will be visible
> > @@ -468,8 +479,7 @@ static void ext4_handle_error(struct super_block *sb)
> > */
> > smp_wmb();
> > sb->s_flags |= SB_RDONLY;
> > - }
> > - if (test_opt(sb, ERRORS_PANIC)) {
> > + } else if (test_opt(sb, ERRORS_PANIC)) {
>
> I think that this would work. Just note that it changes a bit
> the semantic also when the system is in the running state.
>
> ERRORS_RO option always takes precedence over ERRORS_PANIC now.
> It was the other way before.
I know. But they couldn't be set both at the same time so this should not
matter.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists