Open Source and information security mailing list archives
 
Message-ID: <ZZZLTSTwP/e/9DCx@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com>
Date: Thu, 4 Jan 2024 11:38:13 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Carlos Carvalho <carlos@...ica.ufpr.br>
Cc: linux-ext4@...r.kernel.org, linux-raid@...r.kernel.org
Subject: Re: parity raid and ext4 get stuck in writes

On Fri, Dec 22, 2023 at 05:48:01PM -0300, Carlos Carvalho wrote:
> This is finally a summary of a long standing problem. When lots of writes to
> many files are sent in a short time the kernel gets stuck and stops sending
> write requests to the disks. Sometimes it recovers and finally sends the
> modified pages to permanent storage, sometimes not and eventually other
> functions degrade and the machine crashes.
> 
> A simple way to reproduce: expand a kernel source tree, like
> xzcat linux-6.5.tar.xz | tar x -f -
> 
> With the default vm settings for dirty_background_ratio and dirty_ratio this
> will finish quickly with ~1.5GB of dirty pages in ram and ~100k inodes to be
> written and the kernel gets stuck.
> 
> The bug exists in all 6.* kernels; I've tested the latest release of all
> 6.[1-6]. However some conditions must exist for the problem to appear:
> 
> - there must be many inodes to be flushed; just many bytes in a few files don't
>   show the problem
> - it happens only with ext4 on a parity raid array
> 
> I've moved one of our arrays to xfs and everything works fine, so it's either
> specific to ext4 or xfs is not affected. When the lockup happens the flush
> kworker starts using 100% cpu permanently. I have not observed the bug in
> raid10, only in raid[56].
> 
> The problem is more easily triggered with 6.[56] but 6.1 is also affected.
> 
> Limiting dirty_bytes and dirty_background_bytes to low values reduces the
> probability of lockup, probably because the process generating writes is
> stopped before too many files are created.
 
 Hey Carlos,

 Thanks for sharing this. As per your comment on the kernel bugzilla,
 it seems the issue is fixed for you with stripe=0 as well, so it
 might actually be the same issue. However, most of the people there are
 not able to replicate this on kernels before 6.5, so I'm interested in
 your statement that you see this in 6.1 as well.
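
 (In case it's useful for testing: the stripe hint can be checked and
 toggled at runtime. A rough sketch below; /dev/md0 and /mnt are
 placeholders for your actual array and mount point, and the remount
 needs root.)

```shell
# Check whether ext4 is currently using a stripe hint on the array;
# an explicit stripe= option shows up in /proc/mounts.
grep ' /mnt ' /proc/mounts

# Disable the stripe-aligned allocation hint without unmounting.
mount -o remount,stripe=0 /dev/md0 /mnt
```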

 Would it be possible to replicate this on 6.1 or any pre-6.5 kernel with
 some perf probes and share the report? I've added the steps to add the
 probes on pre-6.4 kernels here [1] (although they should hopefully work
 with 6.1 - 6.3 as well, since I don't think there'll be much change in
 the functions probed there). The probes would be helpful to confirm
 whether the issue we see on 6.5+ kernels and the one you are seeing on
 6.1 are the same.
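
 For reference, the general shape of such a session is roughly as
 follows (a sketch only; the exact probe points are the ones listed in
 [1] - the function name below is just a placeholder, and both steps
 need root plus a kernel with debug info for perf probe to resolve
 symbols):

```shell
# Add a dynamic kernel probe (placeholder function name, not from [1]).
perf probe --add ext4_mb_regular_allocator

# Record all dynamic probes system-wide while reproducing the hang,
# then write a plain-text summary to share.
perf record -e 'probe:*' -aR -- sleep 60
perf report --stdio > perf-report.txt
```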

 Thanks,
 ojaswin

 [1] https://bugzilla.kernel.org/show_bug.cgi?id=217965#c36
