Message-ID: <ZZZL/b58ROm+uflo@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com>
Date: Thu, 4 Jan 2024 11:41:09 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Daniel Dawson <danielcdawson@...il.com>
Cc: Carlos Carvalho <carlos@...ica.ufpr.br>, linux-ext4@...r.kernel.org, linux-raid@...r.kernel.org
Subject: Re: parity raid and ext4 get stuck in writes

On Sun, Dec 24, 2023 at 11:39:05PM -0800, Daniel Dawson wrote:
> On 12/22/23 12:48 PM, Carlos Carvalho wrote:
> > This is finally a summary of a long standing problem. When lots of writes to
> > many files are sent in a short time the kernel gets stuck and stops sending
> > write requests to the disks. Sometimes it recovers and finally sends the
> > modified pages to permanent storage, sometimes not and eventually other
> > functions degrade and the machine crashes.
> >
> > A simple way to reproduce: expand a kernel source tree, like
> > xzcat linux-6.5.tar.xz | tar x -f -
> This sounds almost exactly like a problem I was having, right down to
> triggering it by writing the files of a kernel tree, though the details in
> my case are slightly different. I wanted to report it, but wanted to get a
> better handle on it and never managed it, and now I've changed my setup such
> that it doesn't happen anymore.
> > - it happens only with ext4 on a parity raid array
>
> This is where it differs for me. I experienced it only with btrfs. But I had

Hi Daniel,

So I think there are some other people noticing something similar on
btrfs as well [1]. Maybe this is related to the issue you are noticing although
they have not mentioned anything about raid in btrfs.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2242391

Regards,
ojaswin

> two arrays with it, one on SSDs and one on HDDs. The HDD array exhibited the
> problem almost exclusively (the SSDs, I think, exhibited it once in several
> months, while the HDDs did pretty much every time I tried to compile a new
> kernel (until I started working around it), and even from some other things,
> which was a couple of times a week). I imagine because HDDs much slower and
> therefore allow more data to get cached.
>
> Now that I've switched the HDD array to ext4, I haven't experienced the
> issue even once. But the setup has better performance, so maybe it's just
> because it flushes its writes faster.
>
> --
> PGP fingerprint: 5BBD5080FEB0EF7F142F8173D572B791F7B4422A
>