Message-ID: <CAGh96GGJusudLzNePVVKqOwm8U9=HAFyz43Zm9Mypy2F5-PdSw@mail.gmail.com>
Date: Thu, 14 Jul 2011 13:50:31 -0700
From: Erik Jensen <eriksjunk@...nsn.net>
To: John Stoffel <john@...ffel.org>
Cc: Alasdair G Kergon <agk@...hat.com>, NeilBrown <neilb@...e.de>,
Ric Wheeler <rwheeler@...hat.com>,
Nico Schottelius <nico-lkml-20110623@...ottelius.org>,
LKML <linux-kernel@...r.kernel.org>,
Chris Mason <chris.mason@...cle.com>,
linux-btrfs <linux-btrfs@...r.kernel.org>
Subject: Re: Mis-Design of Btrfs?
On Thu, Jul 14, 2011 at 12:50 PM, John Stoffel <john@...ffel.org> wrote:
>>>>>> "Alasdair" == Alasdair G Kergon <agk@...hat.com> writes:
>
> Alasdair> On Thu, Jul 14, 2011 at 04:38:36PM +1000, Neil Brown wrote:
>>> It might make sense for a device to be able to report what the maximum
>>> 'N' supported is... that might make stacked raid easier to manage...
>
> Alasdair> I'll just say that any solution ought to be stackable.
>
> I've been mulling this over too and wondering how you'd handle this,
> because upper layers really can't peek down into lower layers easily.
> As far as I understand things.
>
> So if you have btrfs -> luks -> raid1 -> raid6 -> nbd -> remote disks
>
> How does btrfs handle errors (or does it even see them!) from the
> raid6 level when a single nbd device goes away? Or taking the
> original example, when btrfs notices a checksum isn't correct, how
> would it push down multiple levels to try and find the correct data?
>
> Alasdair> This means understanding both that the number of data access
> Alasdair> routes may vary as you move through the stack, and that this
> Alasdair> number may depend on the offset within the device.
>
> It almost seems to me that the retry needs to be done at each level on
> its own, without pushing down or up the stack. But this doesn't
> handle the wrong file checksum issue.
>
> Hmm... maybe instead of just one number, we need another to count the
> levels down you go (or just split a 16-bit integer in half, bottom half
> being count of tries, the upper half being levels down to try that
> read?)
>
> It seems to defeat the purpose of layers if you can go down and find
> out how many layers there are underneath you....
>
> John
A random thought: What if we allow the number to wrap at each level,
and, each time it wraps, increment the number passed to the next lower
level.
A zero would propagate down, letting each level do what it wants:
luks: 0
raid1: 0
raid6: 0
nbd: 0
And higher numbers would indicate the method at each level:
For a 1:
luks: 1
raid1: 1
raid6: 1
nbd: 1
For a 3:
luks: 1 (only one possibility, so it wraps twice and passes three down)
raid1: 1 (two possibilities, so we wrap back to one and pass two down,
since we wrapped once)
raid6: 2 (not wrapped, so it passes one down)
nbd: 1
When the bottom-most level gets an N that it can't handle, it would
return EINVAL, which would be propagated up the stack.
This would allow the same algorithm of incrementing N until we receive
good data or EINVAL, and would exhaust all ways of reading the data at
all levels.
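To make the wrapping concrete, here is a rough user-space sketch in Python, not kernel code. All of the names (split_retry_number, resolve, read_with_retries, STACK) are made up for illustration, and the number of read methods per layer is an assumption; in particular, the three ways assumed for raid6 are only there to make the example work.

```python
import errno


def split_retry_number(n, ways):
    """Split the number N seen by one layer that has `ways` read methods.

    Returns (method used at this layer, number passed to the next layer).
    Zero means "do what you want" and propagates down unchanged.
    """
    if n == 0:
        return 0, 0
    local = (n - 1) % ways + 1   # wrap into the range 1..ways
    wraps = (n - 1) // ways      # how many times we wrapped
    return local, 1 + wraps      # each wrap bumps the number passed down


def resolve(n, stack):
    """Walk the stack top to bottom; return the method picked per layer.

    `stack` is a list of (name, ways) pairs. The bottom-most layer has
    nobody to pass a wrap to, so it fails with EINVAL when N is too big.
    """
    chosen = []
    for depth, (name, ways) in enumerate(stack):
        bottom = depth == len(stack) - 1
        if bottom and n > ways:
            raise OSError(errno.EINVAL, "no such read method")
        local, n = split_retry_number(n, ways)
        chosen.append((name, local))
    return chosen


def read_with_retries(stack, try_read):
    """Increment N until try_read() returns data or the stack says EINVAL.

    `try_read` is a hypothetical callback standing in for the real read;
    it returns None when the data it got was bad.
    """
    n = 0
    while True:
        try:
            methods = resolve(n, stack)
        except OSError as err:
            if err.errno == errno.EINVAL:
                return None      # every combination has been tried
            raise
        data = try_read(methods)
        if data is not None:
            return data
        n += 1


# Assumed method counts per layer (hypothetical, for illustration only).
STACK = [("luks", 1), ("raid1", 2), ("raid6", 3), ("nbd", 1)]

if __name__ == "__main__":
    for n in range(0, 7):
        print(n, resolve(n, STACK))
```

With these assumed counts, resolve(3, STACK) picks luks 1, raid1 1, raid6 2, nbd 1, the same as the "For a 3" case above. N = 1 through 6 covers all six raid1/raid6 combinations exactly once, and N = 7 reaches nbd as a 2, which is where the EINVAL comes from.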