Message-ID: <f79f67ff-abb0-123d-064b-a826ec2c6666@gunth.ca>
Date:   Fri, 10 Jun 2022 16:05:48 -0600
From:   Logan Gunthorpe <logan@...th.ca>
To:     LKML <linux-kernel@...r.kernel.org>,
        linux-raid <linux-raid@...r.kernel.org>,
        Song Liu <song@...nel.org>
Cc:     Donald Buczek <buczek@...gen.mpg.de>,
        Guoqing Jiang <guoqing.jiang@...ux.dev>,
        Xiao Ni <xni@...hat.com>, Stephen Bates <sbates@...thlin.com>,
        Martin Oliveira <Martin.Oliveira@...eticom.com>,
        David Sloan <David.Sloan@...eticom.com>
Subject: Raid5 Reshape Data Corruption Bug


Hey,

I've diagnosed a bug in the reshape code that corrupts data. However, I
don't have a good solution to the problem, and a solution may be quite
complicated. I suspect this is the cause of the random failures I see with
01r5integ and 01raid6integ (though I can't say for certain, as I have a
quicker reproduction method).

The bug is that, during reshape, the reads for EXPAND_SOURCE stripes do
not complete in the correct order. If the read for a later stripe comes
back from the disk before an earlier EXPAND_SOURCE stripe has read its
data, then an EXPAND_READY stripe might write a block before the source
stripe was able to read it. The write overwrites data that has not been
moved yet, which results in corrupt data on the disk. This happens
reasonably frequently.
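
To make the hazard concrete, here is a toy model in Python. It is only
a sketch and not the md/raid5 code: the four-block "disk", the one-block
shift and the function name run_reshape are assumptions made up for
illustration. Item k lives at block k+1 in the old layout and moves to
block k in the new one, so block 1 is both the source of item 0 and the
destination of item 1.

```python
def run_reshape(schedule):
    """Replay read/write events of a toy reshape and return the new layout.

    Toy assumption: item k sits at block k+1 before the reshape and must
    end up at block k afterwards.
    """
    disk = ["-", "d0", "d1", "d2"]
    buffers = {}
    for op, k in schedule:
        if op == "read":
            # The source read for item k completes.
            buffers[k] = disk[k + 1]
        else:
            # The destination write for item k is issued.
            disk[k] = buffers[k]
    return disk[:3]


# Reads complete in order: every block is read before it is overwritten.
in_order = [("read", 0), ("write", 0), ("read", 1), ("write", 1),
            ("read", 2), ("write", 2)]

# The read for item 1 completes first, so item 1 is written to block 1
# before item 0 has been read from block 1.
reordered = [("read", 1), ("write", 1), ("read", 0), ("write", 0),
             ("read", 2), ("write", 2)]

print(run_reshape(in_order))   # ['d0', 'd1', 'd2']
print(run_reshape(reordered))  # ['d1', 'd1', 'd2'], d0 is lost
```

In the second schedule nothing fails loudly; the late read simply
returns the already overwritten block.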

I suspect this is made worse by modern SSDs. Spinning disks would be
less likely to exhibit this problem, as they would naturally try to
order the reads by sector.

So somehow there needs to be a way to prevent an EXPAND_READY stripe
from writing the data for a specific device block before the
corresponding EXPAND_SOURCE block has read the data. And I don't see a
trivial way to get that done.
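
The ordering rule itself can be stated in a few lines of Python on a
toy model. This is a sketch of the constraint only, not a proposal for
the raid5 stripe cache, where tracking this dependency is exactly the
hard part. The layout (item k moves from block k+1 to block k) and the
name reshape_with_guard are hypothetical.

```python
def reshape_with_guard(read_order, n=3):
    """Toy reshape that defers a write until its target block was read.

    Toy assumption: item k sits at block k+1 before the reshape and must
    end up at block k, so block k is also the source of item k-1.
    """
    disk = ["-"] + [f"d{k}" for k in range(n)]
    buffers = {}
    pending_reads = set(range(n))  # items whose source is not read yet
    ready = []                     # items read but not yet written

    for k in read_order:
        # The source read for item k completes (possibly out of order).
        buffers[k] = disk[k + 1]
        pending_reads.discard(k)
        ready.append(k)

        # Issue every write whose destination block is no longer needed
        # as the source of a pending read; hold back the rest.
        still_waiting = []
        for item in ready:
            if (item - 1) in pending_reads:
                still_waiting.append(item)
            else:
                disk[item] = buffers[item]
        ready = still_waiting

    return disk[:n]


print(reshape_with_guard([1, 0, 2]))  # ['d0', 'd1', 'd2']
print(reshape_with_guard([2, 1, 0]))  # ['d0', 'd1', 'd2']
```

With the guard in place, the read completion order no longer affects
the result in this model.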

If anyone has any clever solutions, it would be good to hear them.
Otherwise, I don't think I'll have time to find a solution myself.

Logan
