Message-ID: <d63fc6fc-f848-4d78-b9d2-b7baf9f19467@sandeen.net>
Date: Tue, 12 May 2020 22:16:00 -0500
From: Eric Sandeen <sandeen@...deen.net>
To: julio.lajara@...m.rpi.edu, linux-ext4@...r.kernel.org
Subject: Re: Reducing ext4 fs issues resulting from frequent hard poweroffs
On 5/12/20 4:08 PM, Julio Lajara wrote:
> Hi all, I currently manage an IOT fleet based on Intel NUCs running
> Ubuntu 18.04 Server on SSDs with ext4, no swap. The device usage is
> more CPU bound than I/O bound and we are having some issues keeping a
> subset of devices running due to them being hard powered off in the
> field in some regions (sometimes as frequently as every 12hrs). Due to
> current difficulties in getting devices back from the field I'm
> looking into tweaking them as best as possible to survive these hard
> power-offs, barring any physical SSD issues.
I don't think you've actually said what the failure mode after power
loss is, have you?
> Currently I have tried tweaking some ext4 and I/O settings with the following:
>
> * kernel options:
> elevator=noop fsck.mode=force fsck.repair=yes
>
> * fstab ext4 specific mount options:
> commit=1,max_batch_time=0
>
> Are there any other configuration settings or changes to the above
> that would make sense to try here for this use case? I am hoping to at
> least make the fsck repair the last line of defence so it doesn't get
> stuck waiting for a prompt to repair it at boot, but want to try to
> change the I/O / ext4 behavior if possible so it's writing as
> frequently as sanely possible to try to reduce the frequency where
> fsck is actually needed.
I can't tell from this why fsck is needed in the first place; what
actually goes wrong when power is lost? Ted's right that properly
behaving hardware should not require any special attention after
power loss to restore filesystem consistency, but I can't tell for
sure what your actual root cause for boot failure is from this
email...
-Eric