Message-Id: <200908201743.50167.eike-kernel@sf-tec.de>
Date: Thu, 20 Aug 2009 17:43:41 +0200
From: Rolf Eike Beer <eike-kernel@...tec.de>
To: Mark Lord <liml@....ca>
Cc: Ric Wheeler <rwheeler@...hat.com>, Ingo Molnar <mingo@...e.hu>,
Christoph Hellwig <hch@...radead.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
xfs@....sgi.com, linux-fsdevel@...r.kernel.org,
linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
jens.axboe@...cle.com,
"IDE/ATA development list" <linux-ide@...r.kernel.org>,
Neil Brown <neilb@...e.de>
Subject: Re: [PATCH, RFC] xfs: batched discard support
Mark Lord wrote:
> Ric Wheeler wrote:
> > Note that returning consistent data is critical for devices that are
> > used in a RAID group since you will need each RAID block that is used to
> > compute the parity to continue to return the same data until you
> > overwrite it with new data :-)
> >
> > If we have a device that does not support this (or is misconfigured not
> > to do this), we should not use those devices in an MD group & do discard
> > against it...
>
> ..
>
> Well, that's a bit drastic. But the RAID software should at least
> not issue TRIM commands in ignorance of such.
>
> Would it still be okay to do the TRIMs when the entire parity stripe
> (across all members) is being discarded? (As opposed to just partial
> data there being dropped)
I think there is a related use case that could benefit from TRIM/UNMAP/whatever
support in file systems even if the physical devices do not support it. I have
a RAID5 at work with LVM on top of it. This week I deleted an old logical
volume of some 200GB that had been moved to a different volume group; tomorrow
I will start replacing all the disks in the RAID with bigger ones. If LVM told
the RAID "hey, this space is total garbage from now on", the RAID would not
have to do any parity calculation when rebuilding that area, but could simply
write fixed patterns to all disks (e.g. 0 to the first data chunk, 0 to the
second data chunk, and 0 as "0 xor 0" to the parity chunk). If some of the
underlying devices additionally supported a "write all zeroes" command, this
operation could be sped up even more; with a "write fixed pattern" command
every unused chunk would go down to a single write operation (per disk) on
rebuild, regardless of which parity algorithm is used.
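To make it concrete, something like this is what I have in mind for the
rebuild path (a minimal self-contained sketch with made-up names and
in-memory "disks", not the real MD code):

    /* Sketch only: helper names and the fake member disks are invented
     * for illustration, this is not how MD is actually structured. */
    #include <stdio.h>
    #include <string.h>
    #include <stddef.h>

    #define CHUNK  8        /* tiny chunk size so the demo stays readable */
    #define NDISKS 3        /* two data chunks + one parity chunk */

    static unsigned char disk[NDISKS][CHUNK];  /* fake member disks */
    static int stripe_unmapped = 1;            /* assumed discard-state flag */

    static void rebuild(int failed)
    {
            unsigned char acc[CHUNK] = { 0 };

            if (stripe_unmapped) {
                    /* Discarded stripe: all chunks are defined to read
                     * as 0, and 0 xor 0 = 0, so just write zeroes -
                     * one write, no reads, no XOR. */
                    memcpy(disk[failed], acc, CHUNK);
                    return;
            }

            /* Normal path: XOR the surviving members into the failed one. */
            for (int d = 0; d < NDISKS; d++)
                    if (d != failed)
                            for (size_t i = 0; i < CHUNK; i++)
                                    acc[i] ^= disk[d][i];
            memcpy(disk[failed], acc, CHUNK);
    }

    int main(void)
    {
            rebuild(2);                              /* "rebuild" the parity disk */
            printf("parity[0] = %u\n", disk[2][0]);  /* prints 0 */
            return 0;
    }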
And even while things are in use the RAID can benefit from this. If we just
define that unmapped space always reads back as 0, then when I write to a RAID
volume and the other data chunk needed for the parity calculation is unmapped,
the parity calculation becomes easy: we already know half of the input values
in advance, namely 0. So we can save the read from the second data stripe and
most of the calculation. A "dd if=/dev/md0" on unmapped space then becomes
more or less the same as "dd if=/dev/zero".
I only fear that these ideas are too obvious for me to be the first one to
have them ;)
Greetings,
Eike