linux-ext4 - Re: Question: errors=continue behaviour for failed external journal device

Message-ID: <53D65099.6010901@dobrotescu.ca>
Date:	Mon, 28 Jul 2014 09:31:05 -0400
From:	Vlad Dobrotescu <vlad@...rotescu.ca>
To:	Lukáš Czerner <lczerner@...hat.com>
CC:	Theodore Ts'o <tytso@....edu>, linux-ext4@...r.kernel.org
Subject: Re: Question: errors=continue behaviour for failed external journal device

If you are talking about changes, wouldn't "read-only" be a better
fall-back alternative for a failed or missing external journal?

Vlad

On 28/07/2014 09:25, Lukáš Czerner wrote:
> On Mon, 28 Jul 2014, Theodore Ts'o wrote:
>
>> Date: Mon, 28 Jul 2014 09:17:42 -0400
>> From: Theodore Ts'o <tytso@....edu>
>> To: Lukáš Czerner <lczerner@...hat.com>
>> Cc: Vlad Dobrotescu <vlad@...rotescu.ca>, linux-ext4@...r.kernel.org
>> Subject: Re: Question: errors=continue behaviour for failed external journal
>>      device
>>
>> On Mon, Jul 28, 2014 at 11:11:45AM +0200, Lukáš Czerner wrote:
>>> I very much agree with that, which is why I was quite surprised to
>>> find out recently that this is the default. I was living under the
>>> delusion that the default was ERRORS_RO for as long as I can remember.
>>> So my question is, should we change it? This really does not seem
>>> like a sane default.
>> Yeah, I've been thinking that this would be a good thing to change for
>> 1.43.
>>
>> The only reason that errors=continue was the default is historical.
>> I could imagine some system administrators being
>> surprised when all of a sudden their production systems start getting
>> lots of EROFS errors reported by applications.  So I could
>> potentially imagine some Help Desks / Support folks at distributions
>> not being enthusiastic about such a change.
>>
>> Hmm.... we are starting to have some errors where we can allow the
>> system to stagger on, even if we need to disallow new allocations in
>> some block groups.  I wonder if it is worthwhile to have a "continue
>> for correctable errors".  The danger, of course, is that some errors,
>> even if they are correctable, (such as freeing a block which is
>> already freed), could be a hint that there are other fs corruptions,
>> not yet detected, that might lead to data loss if we reboot and fsck,
>> or remount readonly right away.  So the question is while there is
>> some value, is it worth the added complexity to add an
>> "errors=continue-correctable" option?
> Right,
>
> I like the idea of the new errors option, even though the name is a
> bit long (maybe "auto"?). It would try its best to continue, but would
> be allowed to remount read-only if we cannot recover from the error.
>
> This would however need some work to make it work reliably and most
> importantly a fair amount of testing. Though I think it's worth the
> work.
>
> -Lukas
>
>> 							- Ted
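
For readers following the thread, the behaviours being compared can be
sketched as a small decision table. This is only an illustrative model,
not ext4 code: "continue", "remount-ro" and "panic" are the existing
errors= settings, while AUTO stands in for the proposed
"errors=continue-correctable" / "auto" idea, which is hypothetical here.
The names ErrorsPolicy, Action and on_fs_error, and the "correctable"
flag, are assumptions made for the example.

```python
from enum import Enum


class ErrorsPolicy(Enum):
    CONTINUE = "continue"        # existing ext4 errors= setting
    REMOUNT_RO = "remount-ro"    # existing ext4 errors= setting
    PANIC = "panic"              # existing ext4 errors= setting
    AUTO = "auto"                # hypothetical: the option proposed in this thread


class Action(Enum):
    KEEP_RUNNING = "keep running read-write"
    REMOUNT_READ_ONLY = "remount read-only"
    PANIC = "panic"


def on_fs_error(policy: ErrorsPolicy, correctable: bool) -> Action:
    """Return the action this simplified model takes for one detected error.

    `correctable` is an assumed classification supplied by the caller;
    deciding it reliably is the hard part discussed in the thread.
    """
    if policy is ErrorsPolicy.CONTINUE:
        return Action.KEEP_RUNNING
    if policy is ErrorsPolicy.REMOUNT_RO:
        return Action.REMOUNT_READ_ONLY
    if policy is ErrorsPolicy.PANIC:
        return Action.PANIC
    # AUTO: continue only when the error is believed to be correctable,
    # otherwise fall back to read-only.
    return Action.KEEP_RUNNING if correctable else Action.REMOUNT_READ_ONLY


if __name__ == "__main__":
    for policy in ErrorsPolicy:
        for correctable in (True, False):
            action = on_fs_error(policy, correctable)
            print(f"errors={policy.value:<10} correctable={correctable!s:<5} -> {action.value}")
```

The sketch deliberately ignores the caveat raised above: an error that
looks correctable (such as freeing an already-freed block) may still hint
at other, undetected corruption, so a real implementation could not treat
the flag as a simple boolean.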