Message-ID: <20200225172355.GA14617@mit.edu>
Date:   Tue, 25 Feb 2020 12:23:55 -0500
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Jean-Louis Dupond <jean-louis@...ond.be>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: Filesystem corruption after unreachable storage

On Tue, Feb 25, 2020 at 02:19:09PM +0100, Jean-Louis Dupond wrote:
> FYI,
> 
> Just did same test with e2fsprogs 1.45.5 (from buster backports) and kernel
> 5.4.13-1~bpo10+1.
> And having exactly the same issue.
> The VM needs a manual fsck after storage outage.
> 
> Don't know if it's useful to test with 5.5 or 5.6?
> But it seems like the issue still exists.

This is going to be a long shot, but if you could try testing with
5.6-rc3, or with this commit cherry-picked into a 5.4 or later kernel:

   commit 8eedabfd66b68a4623beec0789eac54b8c9d0fb6
   Author: wangyan <wangyan122@...wei.com>
   Date:   Thu Feb 20 21:46:14 2020 +0800

       jbd2: fix ocfs2 corrupt when clearing block group bits
       
       I found a NULL pointer dereference in ocfs2_block_group_clear_bits().
       The running environment:
               kernel version: 4.19
               A cluster with two nodes, 5 luns mounted on two nodes, and do some
               file operations like dd/fallocate/truncate/rm on every lun with storage
               network disconnection.
       
       The fallocate operation on dm-23-45 caused an null pointer dereference.
       ...

... it would be interesting to see if that fixes things for you.  I
can't guarantee that it will, but the trigger of the failure which
wangyan found is very similar indeed.

Thanks,

						- Ted
