linux-kernel - Re: ext3/ext4 filesystem corruption under post 5.1.0 kernels

Message-ID: <CAMuHMdVn9zMsas47CZpWdrFMTu0htn11Dhk459bosFxW7YZv_A@mail.gmail.com>
Date:   Mon, 1 Jul 2019 16:08:13 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     "Theodore Ts'o" <tytso@....edu>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Arthur Marsh <arthur.marsh@...ernode.on.net>,
        Richard Weinberger <richard.weinberger@...il.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: ext3/ext4 filesystem corruption under post 5.1.0 kernels

Hi Ted,

On Mon, Jul 1, 2019 at 3:56 PM Theodore Ts'o <tytso@....edu> wrote:
> On Mon, Jul 01, 2019 at 02:43:14PM +0200, Geert Uytterhoeven wrote:
> > Despite this fix having been applied upstream, the kernel prints from
> > time to time:
> >
> >     EXT4-fs (sda1): error count since last fsck: 5
> >     EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
> >     EXT4-fs (sda1): last error at time 1558114349: ext4_get_branch:171: inode 1980: block 27550
> >
> > This happens even after a manual run of "e2fsck -f" (while it's mounted
> > RO), which reports a clean file system.
>
> What's happening is this.  When the kernel detects a corruption, newer
> kernels will set these superblock fields:
>
>         __le32  s_error_count;          /* number of fs errors */
>         __le32  s_first_error_time;     /* first time an error happened */
>         __le32  s_first_error_ino;      /* inode involved in first error */
>         __le64  s_first_error_block;    /* block involved of first error */
>         __u8    s_first_error_func[32] __nonstring;     /* function where the error happened */
>         __le32  s_first_error_line;     /* line number where error happened */
>         __le32  s_last_error_time;      /* most recent time of an error */
>         __le32  s_last_error_ino;       /* inode involved in last error */
>         __le32  s_last_error_line;      /* line number where error happened */
>         __le64  s_last_error_block;     /* block involved of last error */
>         __u8    s_last_error_func[32] __nonstring;      /* function where the error happened */
>
> When newer versions of e2fsck *fix* the corruption, they will clear
> these fields.  It's basically a safety check because *way* too many
> ext4 users run with errors=continue (aka, "don't worry, be happy"
> mode), and so this is a poke in the system logs that the file system
> is corrupted, and they, really, *REALLY* should fix it before they
> lose (more) data.

Thanks for the explanation, much appreciated!
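
For anyone reading along, here is a minimal Python sketch of how the fields
quoted above map onto the log lines. It assumes the fields are packed
little-endian in exactly the order shown, with no padding. The names
ERROR_REGION, decode_error_region and describe are made up for illustration,
and the sample bytes are synthetic, built from the numbers in the log excerpt
rather than read from a real superblock. Where this region sits inside the
on-disk superblock is not covered.

```python
import struct

# Field order and widths follow the struct quoted above:
# __le32 -> I, __le64 -> Q, __u8[32] -> 32s. "<" = little-endian, no padding.
ERROR_REGION = struct.Struct("<IIIQ32sIIIIQ32s")

FIELDS = (
    "s_error_count",
    "s_first_error_time", "s_first_error_ino", "s_first_error_block",
    "s_first_error_func", "s_first_error_line",
    "s_last_error_time", "s_last_error_ino", "s_last_error_line",
    "s_last_error_block", "s_last_error_func",
)


def decode_error_region(raw):
    """Unpack the error-tracking fields into a dict (hypothetical helper)."""
    values = dict(zip(FIELDS, ERROR_REGION.unpack(raw)))
    for key in ("s_first_error_func", "s_last_error_func"):
        # __nonstring: the name may fill all 32 bytes without a trailing NUL,
        # so cut at the first NUL only if there is one.
        values[key] = values[key].split(b"\0", 1)[0].decode("ascii", "replace")
    return values


def describe(values, prefix, label):
    """Format one record roughly the way the kernel log lines read."""
    return "{} error at time {}: {}:{}: inode {}: block {}".format(
        label,
        values[prefix + "time"],
        values[prefix + "func"],
        values[prefix + "line"],
        values[prefix + "ino"],
        values[prefix + "block"],
    )


# Synthetic sample built from the numbers in the log excerpt.
sample = ERROR_REGION.pack(
    5,
    1557931133, 1980, 27550, b"ext4_get_branch", 171,
    1558114349, 1980, 171, 27550, b"ext4_get_branch",
)

decoded = decode_error_region(sample)
print("error count since last fsck:", decoded["s_error_count"])
print(describe(decoded, "s_first_error_", "initial"))
print(describe(decoded, "s_last_error_", "last"))
```

Since these values live in the superblock, the messages keep coming back
after every mount until something clears the fields, whether or not any new
corruption is detected.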

> > The inode and block numbers match the numbers printed due to the
> > previous bug.
>
> You can also see when the last file system error was detected via:
>
> % date -d @1558114349
> Fri 17 May 2019 01:32:29 PM EDT

Good. So no new errors detected after the fix.
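
The same conversion can be done for both timestamps at once. A small Python
sketch, using only the two values from the log above and printing them in UTC
so the result does not depend on the local time zone (the variable names are
just for illustration):

```python
from datetime import datetime, timezone

# Timestamps taken from the "initial error" and "last error" log lines.
first = datetime.fromtimestamp(1557931133, tz=timezone.utc)
last = datetime.fromtimestamp(1558114349, tz=timezone.utc)

print("initial error:", first.isoformat())
print("last error:   ", last.isoformat())
print("time between: ", last - first)
```

Both recorded errors fall in mid-May 2019, a little over two days apart, which
is consistent with nothing new having been recorded since.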

> > Do you have an idea what's wrong?
> > Note that I run a very old version of e2fsck (from a decade ago).
>
> ... and that's the problem.  If you're going to be using newer
> versions of the kernel, you really should be using newer versions of
> e2fsprogs.
>
> There have been a lot of bug fixes in the last 10 years, and some of
> them can be data corruption bugs....

Yeah, one day I'll have to change the winning horse...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds