Message-ID: <20090830071957.GA1656@ucw.cz>
Date: Sun, 30 Aug 2009 09:19:57 +0200
From: Pavel Machek <pavel@....cz>
To: david@...g.hm
Cc: Rob Landley <rob@...dley.net>, Theodore Tso <tytso@....edu>,
Rik van Riel <riel@...hat.com>,
Ric Wheeler <rwheeler@...hat.com>,
Florian Weimer <fweimer@....de>,
Goswin von Brederlow <goswin-v-b@....de>,
kernel list <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...l.org>, mtk.manpages@...il.com,
rdunlap@...otime.net, linux-doc@...r.kernel.org,
linux-ext4@...r.kernel.org, corbet@....net
Subject: Re: [patch] ext2/3: document conditions when reliable operation is
possible
Hi!
>> I thought the reason for that was that if your metadata is horked, further
>> writes to the disk can trash unrelated existing data because it's lost track
>> of what's allocated and what isn't. So back when the assumption was "what's
>> written stays written", then keeping the metadata sane was still darn
>> important to prevent normal operation from overwriting unrelated existing
>> data.
>>
>> Then Pavel notified us of a situation where interrupted writes to the disk can
>> trash unrelated existing data _anyway_, because the flash block size on the 16
>> gig flash key I bought retail at Fry's is 2 megabytes, and the filesystem thinks
>> it's 4k or smaller. It seems like what _broke_ was the assumption that the
>> filesystem block size >= the disk block size, and nobody noticed for a while.
>> (Except the people making jffs2 and friends, anyway.)
>>
>> Today we have cheap plentiful USB keys that act like hard drives, except that
>> their write block size isn't remotely the same as hard drives', but they
>> pretend it is, and then the block wear levelling algorithms fuzz things
>> further. (Gee, a drive controller lying about drive geometry, the scsi crowd
>> should feel right at home.)
>
> actually, you don't know if your USB key works that way or not. Pavel has
> some that do, that doesn't mean that all flash drives do
>
> when you do a write to a flash drive you have to do the following items
>
> 1. allocate an empty eraseblock to put the data on
>
> 2. read the old eraseblock
>
> 3. merge the incoming write to the eraseblock
>
> 4. write the updated data to the flash
>
> 5. update the flash translation layer to point reads at the new location
> instead of the old location.
That would need two erases per single sector written, no? Erase is in
the millisecond range, so the performance would be just way too bad :-(.
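
To make the quoted five steps and the erase-cost worry concrete, here is a
minimal sketch in Python. Everything in it is hypothetical: the ToyFlash
class, the tiny eraseblock size, the immediate erase of the old block, and
the 1 ms erase time are assumptions for illustration, not a description of
how any real USB key's firmware works.

```python
SECTOR = 512                 # bytes per sector (assumed)
SECTORS_PER_BLOCK = 8        # toy eraseblock; real ones are far larger
ERASED = b"\xff" * SECTOR    # erased flash reads back as all ones


class ToyFlash:
    """Hypothetical model of the quoted five steps, not real firmware."""

    def __init__(self, logical_blocks, spare_blocks=1):
        total = logical_blocks + spare_blocks
        # Physical eraseblocks, each a list of sectors.
        self.blocks = [[ERASED] * SECTORS_PER_BLOCK for _ in range(total)]
        # Translation layer: logical eraseblock -> physical eraseblock.
        self.ftl = {i: i for i in range(logical_blocks)}
        # Pool of already-erased physical blocks.
        self.free = list(range(logical_blocks, total))

    def write_sector(self, lba, data):
        assert len(data) == SECTOR
        logical, offset = divmod(lba, SECTORS_PER_BLOCK)
        new = self.free.pop()              # 1. allocate an empty eraseblock
        old = self.ftl[logical]
        merged = list(self.blocks[old])    # 2. read the old eraseblock
        merged[offset] = data              # 3. merge the incoming write
        self.blocks[new] = merged          # 4. write the updated data
        self.ftl[logical] = new            # 5. point reads at the new location
        # Assumption: the old block is erased at once and returned to the pool.
        self.blocks[old] = [ERASED] * SECTORS_PER_BLOCK
        self.free.append(old)

    def read_sector(self, lba):
        logical, offset = divmod(lba, SECTORS_PER_BLOCK)
        return self.blocks[self.ftl[logical]][offset]


def max_sector_writes_per_second(erases_per_write, erase_ms):
    """Upper bound on sector writes per second if erase time dominates."""
    return 1000.0 / (erases_per_write * erase_ms)


flash = ToyFlash(logical_blocks=2)
flash.write_sector(3, b"a" * SECTOR)
print(flash.read_sector(3) == b"a" * SECTOR)    # True
print(max_sector_writes_per_second(2, 1.0))     # 500.0
```

With two erases per sector write and an assumed 1 ms per erase, the bound is
500 sector writes per second, i.e. roughly 250 KB/s with 512-byte sectors.
The real figure depends on the actual erase time and on how many erases the
device really performs per write.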
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html