Message-ID: <CAMuHMdWH4Q6YoE1yV8_KhW4ChK+8RMuAqW25o1pg47Yz5f9nYg@mail.gmail.com>
Date:   Fri, 17 May 2019 11:23:31 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     "Theodore Ts'o" <tytso@....edu>,
        Richard Weinberger <richard.weinberger@...il.com>,
        Arthur Marsh <arthur.marsh@...ernode.on.net>,
        LKML <linux-kernel@...r.kernel.org>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: ext3/ext4 filesystem corruption under post 5.1.0 kernels

Hi Ted,

On Sun, May 12, 2019 at 12:07 AM Theodore Ts'o <tytso@....edu> wrote:
> On Sat, May 11, 2019 at 02:43:16PM +0200, Richard Weinberger wrote:
> > [CC'in linux-ext4]
> >
> > On Sat, May 11, 2019 at 1:47 PM Arthur Marsh
> > <arthur.marsh@...ernode.on.net> wrote:
> > >
> > >
> > > The filesystem with the kernel source tree is the root file system, ext3, mounted as:
> > >
> > > /dev/sdb7 on / type ext3 (rw,relatime,errors=remount-ro)
> > >
> > > After the "Compressing objects" stage, the following appears in dmesg:
> > >
> > > [  848.968550] EXT4-fs error (device sdb7): ext4_get_branch:171: inode #8: block 30343695: comm jbd2/sdb7-8: invalid block
> > > [  849.077426] Aborting journal on device sdb7-8.
> > > [  849.100963] EXT4-fs (sdb7): Remounting filesystem read-only
> > > [  849.100976] jbd2_journal_bmap: journal block not found at offset 989 on sdb7-8
>
> This indicates that the extent tree blocks for the journal were found
> to be corrupt; so the journal couldn't be found.
>
> > > # fsck -yv
> > > fsck from util-linux 2.33.1
> > > e2fsck 1.45.0 (6-Mar-2019)
> > > /dev/sdb7: recovering journal
> > > /dev/sdb7 contains a file system with errors, check forced.
>
> But e2fsck had no problem finding the journal.
>
> > > Pass 1: Checking inodes, blocks, and sizes
> > > Pass 2: Checking directory structure
> > > Pass 3: Checking directory connectivity
> > > Pass 4: Checking reference counts
> > > Pass 5: Checking group summary information
> > > Free blocks count wrong (4619656, counted=4619444).
> > > Fix? yes
> > >
> > > Free inodes count wrong (15884075, counted=15884058).
> > > Fix? yes
>
> And no other significant problems were found.  (Ext4 never updates or
> relies on the summary number of free blocks and free inodes, since
> updating it is a scalability bottleneck and these values can be
> calculated from the per block group free block/inodes count.  So the
> fact that e2fsck needed to update them is not an issue.)
>
> So that implies that we got one set of values when we read the journal
> inode when attempting to mount the file system, and a *different* set
> of values when e2fsck was run.  Which means that we need to
> consider the possibility that the problem is below the file system
> layer (e.g., the block layer, device drivers, etc.).
>
>
> > > /dev/sdb7: ***** FILE SYSTEM WAS MODIFIED *****
> > >
> > > Other times, I have gotten:
> > >
> > > "Inodes that were part of a corrupted orphan linked list found."
> > > "Block bitmap differences:"
> > > "Free blocks sound wrong for group"
> > >
>
> This variety of issues also implies that the issue may be in the data
> read by the file system, as opposed to an issue in the file system.
>
> Arthur, can you give us the full details of your hardware
> configuration and your kernel config file?  Also, what kernel git
> commit ID were you testing?

I'm seeing similar things when running post-v5.1 kernels on ARAnyM (Atari emulator):

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    ...
    EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: comm jbd2/sda1-1980: invalid block

and userspace hung somewhere during initial system startup, so I had to
kill the instance.

-----

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    EXT4-fs (sda1): INFO: recovery required on readonly filesystem
    EXT4-fs (sda1): write access will be enabled during recovery
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5078: Filesystem error recorded from previous mount: IO failure
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5079: Marking fs in need of filesystem check.
    EXT4-fs (sda1): recovery complete
    EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
    ...
    Run /sbin/init as init process
    random: fast init done
    EXT4-fs (sda1): re-mounted. Opts:
    random: crng init done
    EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
    EXT4-fs (sda1): error count since last fsck: 1
    EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
    EXT4-fs (sda1): last error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550

-----

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
    ...
    Run /sbin/init as init process
    random: fast init done
    EXT4-fs (sda1): re-mounted. Opts:
    EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
    random: crng init done
    EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: comm jbd2/sda1-1980: invalid block
    Aborting journal on device sda1-1980.
    EXT4-fs (sda1): Remounting filesystem read-only
    jbd2_journal_bmap: journal block not found at offset 426 on sda1-1980
    EXT4-fs error (device sda1): ext4_journal_check_start:61: Detected aborted journal
    EXT4-fs (sda1): error count since last fsck: 3
    EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
    EXT4-fs (sda1): last error at time 1558083596: ext4_journal_check_start:61: inode 1980: block 27550
    EXT4-fs error (device sda1): ext4_remount:5328: Abort forced by user

-----

    EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
    EXT4-fs (sda1): INFO: recovery required on readonly filesystem
    EXT4-fs (sda1): write access will be enabled during recovery
    random: fast init done
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5078: Filesystem error recorded from previous mount: IO failure
    EXT4-fs warning (device sda1): ext4_clear_journal_err:5079: Marking fs in need of filesystem check.
    EXT4-fs (sda1): recovery complete
    EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
    ...
    Run /sbin/init as init process
    random: crng init done
    EXT4-fs (sda1): re-mounted. Opts:
    EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
    EXT4-fs (sda1): error count since last fsck: 4
    EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
    EXT4-fs (sda1): last error at time 1558083665: ext4_remount:5328: inode 1980: block 27550
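
The "initial error at time" and "last error at time" values in these
logs are seconds since the Unix epoch. A minimal Python sketch for
turning them into readable UTC timestamps, in case that helps when
lining up the boots (the helper name to_utc is just something made up
for this illustration):

```python
from datetime import datetime, timezone

# Epoch values copied from the "initial error" / "last error" log lines.
ERROR_TIMES = [1557931133, 1558083596, 1558083665]

def to_utc(epoch_seconds):
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

for t in ERROR_TIMES:
    # Print each raw value next to its ISO 8601 form.
    print(t, to_utc(t).isoformat())
```

The first value decodes to 2019-05-15, and the last two fall on
2019-05-17, shortly before this mail was sent.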

Notes:
  - It's always the same block.
  - The block device is an image file, accessed using
    arch/m68k/emu/nfblock.c, which did not receive any recent (bvec)
    updates.
  - There are no reported errors for the device containing the image
    file on the host.
  - Given that Arthur sees the issue on a different class of machines,
    it's unlikely that the issue is related to a problem with the block
    device (driver). It may still be an issue with the block layer,
    though.
  - Both Arthur and I are mounting an ext3 file system using the ext4
    subsystem.
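
For anyone who wants to check the "always the same block" observation
against their own logs, here is a rough Python sketch. It pulls the
(inode, block) pairs out of "EXT4-fs error" lines and groups them per
device. The regular expression and the function name blocks_by_device
are assumptions made for this illustration; the pattern only matches
the message format seen in this thread, and the sample text is copied
from the logs quoted here.

```python
import re
from collections import defaultdict

# Matches messages such as:
#   EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: ...
ERROR_RE = re.compile(
    r"EXT4-fs error \(device (?P<dev>\w+)\): (?P<func>\w+):(?P<line>\d+): "
    r"inode #(?P<inode>\d+):\s+block (?P<block>\d+)"
)

def blocks_by_device(log_text):
    """Return {device: set of (inode, block)} for all matching error lines."""
    # Collapse all whitespace (including newlines) so that messages
    # wrapped by a mail client are rejoined before matching.
    flat = " ".join(log_text.split())
    seen = defaultdict(set)
    for m in ERROR_RE.finditer(flat):
        seen[m["dev"]].add((int(m["inode"]), int(m["block"])))
    return dict(seen)

SAMPLE = """
[  848.968550] EXT4-fs error (device sdb7): ext4_get_branch:171: inode #8: block 30343695: comm jbd2/sdb7-8: invalid block
    EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980:
block 27550: comm jbd2/sda1-1980: invalid block
    EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980:
block 27550: comm jbd2/sda1-1980: invalid block
    EXT4-fs error (device sda1): ext4_journal_check_start:61: Detected
aborted journal
"""

for dev, pairs in sorted(blocks_by_device(SAMPLE).items()):
    print(dev, sorted(pairs))
```

A device that shows exactly one (inode, block) pair across several
boots is failing at the same place each time.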

Thanks!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
