lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZZZL/b58ROm+uflo@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com>
Date: Thu, 4 Jan 2024 11:41:09 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Daniel Dawson <danielcdawson@...il.com>
Cc: Carlos Carvalho <carlos@...ica.ufpr.br>, linux-ext4@...r.kernel.org,
        linux-raid@...r.kernel.org
Subject: Re: parity raid and ext4 get stuck in writes

On Sun, Dec 24, 2023 at 11:39:05PM -0800, Daniel Dawson wrote:
> On 12/22/23 12:48 PM, Carlos Carvalho wrote:
> > This is finally a summary of a long standing problem. When lots of writes to
> > many files are sent in a short time the kernel gets stuck and stops sending
> > write requests to the disks. Sometimes it recovers and finally sends the
> > modified pages to permanent storage, sometimes not and eventually other
> > functions degrade and the machine crashes.
> > 
> > A simple way to reproduce: expand a kernel source tree, like
> > xzcat linux-6.5.tar.xz | tar x -f -
> This sounds almost exactly like a problem I was having, right down to
> triggering it by writing the files of a kernel tree, though the details in
> my case are slightly different. I wanted to report it, but wanted to get a
> better handle on it and never managed it, and now I've changed my setup such
> that it doesn't happen anymore.
> > - it happens only with ext4 on a parity raid array
> 
> This is where it differs for me. I experienced it only with btrfs. But I had

Hi Daniel,

So I think there are some other people noticing something similar on
btrfs as well [1]. Maybe this is related to the issue you are noticing
although they have not mentioned anything about raid in btrfs.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2242391

Regards,
ojaswin
> two arrays with it, one on SSDs and one on HDDs. The HDD array exhibited the
> problem almost exclusively (the SSDs, I think, exhibited it once in several
> months, while the HDDs did pretty much every time I tried to compile a new
> kernel (until I started working around it), and even from some other things,
> which was a couple of times a week). I imagine because HDDs much slower and
> therefore allow more data to get cached.
> 
> Now that I've switched the HDD array to ext4, I haven't experienced the
> issue even once. But the setup has better performance, so maybe it's just
> because it flushes its writes faster.
> 
> -- 
> PGP fingerprint: 5BBD5080FEB0EF7F142F8173D572B791F7B4422A
> 

Powered by blists - more mailing lists