linux-kernel - Re: [RFC PATCH v1 0/7] Block/XFS: Support alternative mirror device retry

Message-ID: <20181210043015.GS24487@magnolia>
Date:   Sun, 9 Dec 2018 20:30:15 -0800
From:   "Darrick J. Wong" <darrick.wong@...cle.com>
To:     Bob Liu <bob.liu@...cle.com>
Cc:     Christoph Hellwig <hch@...radead.org>,
        Dave Chinner <david@...morbit.com>,
        Allison Henderson <allison.henderson@...cle.com>,
        linux-block@...r.kernel.org, linux-xfs@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        martin.petersen@...cle.com, shirley.ma@...cle.com
Subject: Re: [RFC PATCH v1 0/7] Block/XFS: Support alternative mirror device
 retry

On Sat, Dec 08, 2018 at 10:49:44PM +0800, Bob Liu wrote:
> On 11/28/18 3:45 PM, Christoph Hellwig wrote:
> > On Wed, Nov 28, 2018 at 04:33:03PM +1100, Dave Chinner wrote:
> >> 	- how does propagation through stacked layers work?
> > 
> > The only way it works is by each layer driving it.  Thus my
> > recommendation above, building on your earlier one, to use an index
> > that is filled by the driver at I/O completion time.
> > 
> > E.g.
> > 
> > 	bio_init:		bi_leg = -1
> > 
> > 	raid1:			submit bio to lower driver
> > 	raid 1 completion:	set bi_leg to 0 or 1
> > 
> > Now if we want to allow stacking we need to save/restore bi_leg
> > before submitting to the underlying device.  Which is possible,
> > but quite a bit of work in the drivers.
> > 
> 
> I found it's still very challenging while writing the code.
> save/restore bi_leg may not be enough because the drivers don't know how to do fs-metadata verification.
> 
> E.g. two-layer raid1 stacking
> 
> fs:                  md0(copies:2)
>                      /          \
> layer1/raid1   md1(copies:2)    md2(copies:2)
>                   /    \          /     \
> layer2/raid1   dev0   dev1      dev2    dev3
> 
> Assume dev2 is corrupted
>  => md2: doesn't know how to do fs-metadata verification.
>    => md0: fs verification fails, retry md1 (preserve md2).
> Then md2 will never be retried, even though dev3 may also have the right copy.
> Unless the upper layer device (md0) can know that the number of copies is 4 instead of 2?
> We would also need a way to handle the mapping.
> Did I miss something? Thanks!

<shrug> It seems reasonable to me that the raid1 layer should set the
number of retries to (number of raid1 mirrors) * min(retry count of all
mirrors) so that the upper layer device (md0) would advertise 4 retry
possibilities instead of 2.
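
A rough sketch of that arithmetic, for illustration only.  struct
blkdev_model and retry_count() are hypothetical names, not existing block
layer or md interfaces, and the sketch assumes a plain (non-mirrored)
device advertises exactly one copy:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one device in a stack; not a kernel structure. */
struct blkdev_model {
	const struct blkdev_model *const *mirrors;	/* lower devices, NULL for a leaf */
	size_t nr_mirrors;				/* 0 for a leaf device */
};

/*
 * Number of retry possibilities a device advertises to the layer above.
 * Assumption: a leaf device holds a single copy, so it advertises 1.
 * A raid1 advertises nr_mirrors * min(retry count of all mirrors).
 */
static unsigned int retry_count(const struct blkdev_model *dev)
{
	unsigned int min_lower = 0;
	size_t i;

	if (dev->nr_mirrors == 0)
		return 1;

	assert(dev->mirrors != NULL);
	for (i = 0; i < dev->nr_mirrors; i++) {
		unsigned int r = retry_count(dev->mirrors[i]);

		/* Track the smallest count advertised by any mirror. */
		if (i == 0 || r < min_lower)
			min_lower = r;
	}
	return (unsigned int)dev->nr_mirrors * min_lower;
}
```

With the layout quoted above, md1 and md2 would each advertise 2 * 1 = 2,
and md0 would advertise 2 * min(2, 2) = 4.  Taking the min means that
uneven legs are counted conservatively.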

--D


> -Bob
> 
> >> 	- is it generic/abstract enough to be able to work with
> >> 	  RAID5/6 to trigger verification/recovery from the parity
> >> 	  information in the stripe?
> > 
> > If we get the non -1 bi_leg for parity raid this is an indicator
> > that parity rebuild needs to happen.  For multi-parity setups we could
> > also use different levels there.
> > 
>