Open Source and information security mailing list archives
Message-ID: <ed52f171-646f-47ff-ad3b-be8bef48d813@gmail.com>
Date: Sun, 24 Dec 2023 23:39:05 -0800
From: Daniel Dawson <danielcdawson@...il.com>
To: Carlos Carvalho <carlos@...ica.ufpr.br>, linux-ext4@...r.kernel.org, linux-raid@...r.kernel.org
Subject: Re: parity raid and ext4 get stuck in writes

On 12/22/23 12:48 PM, Carlos Carvalho wrote:
> This is finally a summary of a long standing problem. When lots of writes to
> many files are sent in a short time the kernel gets stuck and stops sending
> write requests to the disks. Sometimes it recovers and finally sends the
> modified pages to permanent storage, sometimes not and eventually other
> functions degrade and the machine crashes.
>
> A simple way to reproduce: expand a kernel source tree, like
> xzcat linux-6.5.tar.xz | tar x -f -

This sounds almost exactly like a problem I was having, right down to
triggering it by writing the files of a kernel tree, though the details
in my case are slightly different. I wanted to report it, but I wanted
to get a better handle on it first and never managed to, and now I've
changed my setup such that it doesn't happen anymore.

> - it happens only with ext4 on a parity raid array

This is where it differs for me. I experienced it only with btrfs. But I
had two arrays with it, one on SSDs and one on HDDs. The HDD array
exhibited the problem almost exclusively (the SSDs, I think, exhibited
it once in several months, while the HDDs did pretty much every time I
tried to compile a new kernel (until I started working around it), and
even from some other things, which was a couple of times a week). I
imagine that's because HDDs are much slower and therefore allow more
data to get cached.

Now that I've switched the HDD array to ext4, I haven't experienced the
issue even once. But the setup has better performance, so maybe it's
just because it flushes its writes faster.

-- 
PGP fingerprint: 5BBD5080FEB0EF7F142F8173D572B791F7B4422A