Message-ID: <20220809091212.mgreambnhgso5hzw@fedora>
Date:   Tue, 9 Aug 2022 11:12:12 +0200
From:   Lukas Czerner <lczerner@...hat.com>
To:     Jiri Slaby <jirislaby@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        minchan@...nel.org, ngupta@...are.org,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Jan Kara <jack@...e.com>, Ted Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: ext2/zram issue [was: Linux 5.19]

On Tue, Aug 09, 2022 at 08:03:11AM +0200, Jiri Slaby wrote:
> Hi,
> 
> On 31. 07. 22, 23:43, Linus Torvalds wrote:
> > So here we are, one week late, and 5.19 is tagged and pushed out.
> > 
> > The full shortlog (just from rc8, obviously not all of 5.19) is below,
> > but I can happily report that there is nothing really interesting in
> > there. A lot of random small stuff.
> 
> Note: I originally reported this downstream for tracking at:
> https://bugzilla.suse.com/show_bug.cgi?id=1202203
> 
> 5.19 behaves pretty weirdly in openSUSE's openQA (as opposed to 5.18 or
> 5.18.15). It's all qemu-kvm "HW"¹⁾:
> https://openqa.opensuse.org/tests/2502148
> loop2: detected capacity change from 0 to 72264
> EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to
> inode 57375 starting block 137216)
> Buffer I/O error on device zram0, logical block 137216
> Buffer I/O error on device zram0, logical block 137217
> ...
> SQUASHFS error: xz decompression failed, data probably corrupt
> SQUASHFS error: Failed to read block 0x2e41680: -5
> SQUASHFS error: xz decompression failed, data probably corrupt
> SQUASHFS error: Failed to read block 0x2e41680: -5
> Bus error
> 
> 
> 
> https://openqa.opensuse.org/tests/2502145
> FS-Cache: Loaded
> begin 644 ldconfig.core.pid_2094.sig_7.time_1659859442
> 
> 
> 
> https://openqa.opensuse.org/tests/2502146
> FS-Cache: Loaded
> begin 644 Xorg.bin.core.pid_3733.sig_6.time_1659858784
> 
> 
> 
> https://openqa.opensuse.org/tests/2502148
> EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to
> inode 57375 starting block 137216)
> Buffer I/O error on device zram0, logical block 137216
> Buffer I/O error on device zram0, logical block 137217
> 
> 
> 
> https://openqa.opensuse.org/tests/2502154
> [   13.158090][  T634] FS-Cache: Loaded
> ...
> [  525.627024][    C0] sysrq: Show State
> 
> 
> 
> Those are various failures -- crashes of ldconfig, Xorg; I/O failures on
> zram; the last one is likely a lockup: something invoked sysrq after a
> 500s stall.
> 
> Interestingly, I've also hit this twice locally:
> > init[1]: segfault at 18 ip 00007fb6154b4c81 sp 00007ffc243ed600 error 6 in libc.so.6[7fb61543f000+185000]
> > Code: 41 5f c3 66 0f 1f 44 00 00 42 f6 44 10 08 01 0f 84 04 01 00 00 48 83 e1 fe 48 89 48 08 49 8b 47 70 49 89 5f 70 66 48 0f 6e c0 <48> 89 58 18 0f 16 44 24 08 48 81 fd ff 03 00 00 76 08 66 0f ef c9
> > ***  signal 11 ***
> > malloc(): unsorted double linked list corrupted
> > traps: init[1] general protection fault ip:7fb61543f8b9 sp:7ffc243ebf40 error:0 in libc.so.6[7fb61543f000+185000]
> > Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > CPU: 0 PID: 1 Comm: init Not tainted 5.19.0-1-default #1 openSUSE Tumbleweed e1df13166a33f423514290c702e43cfbb2b5b575
> 
> KASAN is not helpful either, so it's unlikely to be memory corruption
> (unless it is "HW" related; should I try to turn on IOMMU in qemu?):
> > kasan: KernelAddressSanitizer initialized
> > ...
> > zram: module verification failed: signature and/or required key missing - tainting kernel
> > zram: Added device: zram0
> > zram0: detected capacity change from 0 to 2097152
> > EXT4-fs (zram0): mounting ext2 file system using the ext4 subsystem
> > EXT4-fs (zram0): mounted filesystem without journal. Quota mode: none.
> > EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to inode 16386 starting block 159744)
> > Buffer I/O error on device zram0, logical block 159744
> > Buffer I/O error on device zram0, logical block 159745
> 
> 
> 
> They all look to me like a zram failure. The installer apparently creates
> an ext2 FS, and after it mounts it using the ext4 module, the issue starts
> occurring.
> 
> Any tests I/you could run on 5.19 to exercise zram and ext2? Otherwise I am
> unable to reproduce easily, except using the openSUSE installer :/.

Hi Jiri,

I've tried a quick xfstests run on ext2 on zram and I can't see any
issues like this so far. I will run a full test and report back in case
there is anything obvious.
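
For checking the kernel log of a test run against the signatures quoted
above, something like the following rough Python sketch might help. It is
not part of xfstests; the name scan() and both patterns are my own
assumptions, modeled only on the two log lines in your report, so other
kernel versions may format these messages differently.

```python
import re

# Patterns modeled on the log lines quoted in this thread (assumption:
# the message format matches the 5.19 output shown in the report).
EXT4_WARN = re.compile(
    r"EXT4-fs warning \(device (?P<dev>\w+)\): ext4_end_bio:\d+: "
    r"I/O error (?P<err>\d+) writing to inode (?P<inode>\d+) "
    r"starting block (?P<block>\d+)"
)
BUFFER_ERR = re.compile(
    r"Buffer I/O error on device (?P<dev>\w+), logical block (?P<block>\d+)"
)


def scan(log_text):
    """Return (ext4_warnings, buffer_errors) found in log_text.

    scan() is a hypothetical helper name, not an existing tool.
    """
    # Collapse all whitespace so that a message wrapped across two lines
    # (as in the mail above) still matches as a single string.
    flat = " ".join(log_text.split())
    warnings = [
        {
            "dev": m["dev"],
            "err": int(m["err"]),
            "inode": int(m["inode"]),
            "block": int(m["block"]),
        }
        for m in EXT4_WARN.finditer(flat)
    ]
    errors = [(m["dev"], int(m["block"])) for m in BUFFER_ERR.finditer(flat)]
    return warnings, errors
```

Feeding it the saved output of dmesg (or a serial console log from openQA)
gives the affected devices, inodes and blocks, which makes it easier to
compare a failing run with a clean one.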

-Lukas

> 
> Any other ideas? Or is this known already?
> 
> ¹⁾ The main points are UEFI boot and virtio-blk (it likely happens with virtio-scsi
> too). The cmdline _I_ use: qemu-kvm -device intel-hda -device hda-duplex
> -drive file=/tmp/pokus.qcow2,if=none,id=hd -device virtio-blk-pci,drive=hd
> -drive if=pflash,format=raw,unit=0,readonly=on,file=/usr/share/qemu/ovmf-x86_64-opensuse-code.bin
> -drive if=pflash,format=raw,unit=1,file=/tmp/vars.bin -cdrom /tmp/cd1.iso
> -m 1G -smp 1 -net user -net nic,model=virtio -serial pty -device
> virtio-rng-pci -device qemu-xhci,p2=4,p3=4 -usbdevice tablet
> 
> 
> thanks,
> -- 
> js
> suse labs
> 
