linux-ext4 - [Bug 201685] ext4 file system corruption

Message-ID: <bug-201685-13602-RbXtTItOxH@https.bugzilla.kernel.org/>
Date:   Wed, 05 Dec 2018 08:48:45 +0000
From:   bugzilla-daemon@...zilla.kernel.org
To:     linux-ext4@...r.kernel.org
Subject: [Bug 201685] ext4 file system corruption

https://bugzilla.kernel.org/show_bug.cgi?id=201685

--- Comment #263 from Rainer Fiebig (jrf@...lbox.org) ---
(In reply to Guenter Roeck from comment #240)
> As mentioned earlier, I only ever saw the problem on two of four systems
> (see #57), all running the same kernel and the same version of Ubuntu. The
> only differences are mainboard, CPU, and attached drive types.
> 
> I don't think we know for sure what it takes to trigger the problem. We have
> seen various guesses, from gcc version to l1tf mitigation to CPU type,
> broken hard drives, and whatnot. At this time evidence points to the block
> subsystem, with bisect pointing to a commit which relies on the state of the
> HW queue (empty or not) in conjunction with the 'none' io scheduler. This
> may suggest that drive speed and access timing may be involved. That guess
> may of course be just as wrong as all the others.
> 
> Let's just hope that Jens will be able to track down and fix the problem.
> Then we may be able to get a better idea what it actually takes to trigger
> it.

It would indeed be nice to get a short summary *here* of what happened and why,
once the dust has settled.

It would also be interesting to know why all the testing in the run-up to 4.19,
including the rc-kernels, didn't catch it. It's imo unlikely, for instance, that
everybody tested only with CONFIG_SCSI_MQ_DEFAULT=n.
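
Since the discussion above ties the problem to the 'none' io scheduler, here is
a minimal sketch of how one might list the active scheduler per block device
when comparing affected and unaffected systems. It assumes the usual sysfs
layout (/sys/block/<dev>/queue/scheduler, with the active entry in square
brackets); the function names active_scheduler and report are hypothetical and
not taken from this thread.

```python
from pathlib import Path


def active_scheduler(line):
    """Return the active scheduler named in a sysfs 'scheduler' line.

    The active entry is normally shown in square brackets,
    e.g. "[none] mq-deadline kyber" -> "none".
    """
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # No bracketed entry: fall back to the raw text (assumption: some
    # kernels print a bare name here). An empty line yields None.
    stripped = line.strip()
    return stripped or None


def report(sys_block="/sys/block"):
    """Map each block device name to its active scheduler."""
    result = {}
    for dev in sorted(Path(sys_block).glob("*")):
        sched_file = dev / "queue" / "scheduler"
        try:
            result[dev.name] = active_scheduler(sched_file.read_text())
        except OSError:
            # Device without a scheduler file, or not readable: skip it.
            continue
    return result


if __name__ == "__main__":
    for name, sched in report().items():
        print(f"{name}: {sched}")
```

A device reported as "none" here would be using blk-mq without an io scheduler.
Whether SCSI devices use blk-mq at all on a 4.19 kernel should, as far as I
know, also be visible in /sys/module/scsi_mod/parameters/use_blk_mq, but that
is not checked in this sketch.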

-- 
You are receiving this mail because:
You are watching the assignee of the bug.