Message-ID: <4AA2579F.9010802@redhat.com>
Date:	Sat, 05 Sep 2009 08:20:47 -0400
From:	Ric Wheeler <rwheeler@...hat.com>
To:	Pavel Machek <pavel@....cz>
CC:	Rob Landley <rob@...dley.net>, jim owens <jowens@...com>,
	david@...g.hm, Theodore Tso <tytso@....edu>,
	Florian Weimer <fweimer@....de>,
	Goswin von Brederlow <goswin-v-b@....de>,
	kernel list <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...l.org>, mtk.manpages@...il.com,
	rdunlap@...otime.net, linux-doc@...r.kernel.org,
	linux-ext4@...r.kernel.org, corbet@....net
Subject: Re: [testcase] test your fs/storage stack (was Re: [patch] ext2/3:
 document conditions when reliable operation is possible)

On 09/05/2009 06:28 AM, Pavel Machek wrote:
> On Fri 2009-09-04 07:49:34, Ric Wheeler wrote:
>    
>> On 09/04/2009 03:44 AM, Rob Landley wrote:
>>      
>>> On Thursday 03 September 2009 09:14:43 jim owens wrote:
>>>
>>>        
>>>> Rob Landley wrote:
>>>>
>>>>          
>>>>> I think he understands he was clueless too, that's why he investigated
>>>>> the failure and wrote it up for posterity.
>>>>>
>>>>>
>>>>>            
>>>>>> And Ric said do not stigmatize whole classes of A) devices, B) raid,
>>>>>> and C) filesystems with "Pavel says...".
>>>>>>
>>>>>>              
>>>>> I don't care what "Pavel says", so you can leave the ad hominem at the
>>>>> door, thanks.
>>>>>
>>>>>            
>>>> See, this is exactly the problem we have with all the proposed
>>>> documentation.  The reader (you) did not get what the writer (me)
>>>> was trying to say.  That does not say either of us was wrong in
>>>> what we thought was meant, simply that we did not communicate.
>>>>
>>>>          
>>> That's why I've mostly stopped bothering with this thread.  I could respond to
>>> Ric Wheeler's latest (what does write barriers have to do with whether or not
>>> a multi-sector stripe is guaranteed to be atomically updated during a panic or
>>> power failure?) but there's just no point.
>>>
>>>        
>> The point of that post was that the failure that you and Pavel both
>> attribute to RAID and journalled fs happens whenever the storage cannot
>> promise to do atomic writes of a logical FS block (prevent torn
>> pages/split writes/etc). I gave a specific example of why this happens
>> even with simple, single disk systems.
>>      
> ext3 does not expect atomic write of 4K block, according to Ted. So
> no, it is not broken on single disk.
>    

I am not sure what you mean by "expect."

ext3 (and other file systems) certainly expect that acknowledged writes 
will still be there after a crash.

With your disk write cache on (and no working barriers or non-volatile 
write cache), a crash or power loss can drop writes that were already 
acknowledged. You then either need a repair via fsck or are left with 
corrupted data or metadata.

ext4, btrfs and zfs all do checksumming of writes, but this is only a 
detection mechanism; it does not prevent the partial write.

Once a partial write is detected, it is fixed either from another copy 
(if you have one in btrfs or zfs) or by an offline repair (ext4's fsck).
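
To make the detect-versus-repair distinction concrete, here is a rough 
sketch in Python. It is not how any of these file systems lay out their 
checksums; the 4K block, the 512-byte atomic sector, the trailing CRC32 
and all of the function names are assumptions made up for illustration.

```python
import zlib
from typing import Optional

BLOCK_SIZE = 4096    # logical fs block size (assumed for illustration)
SECTOR_SIZE = 512    # unit the disk is assumed to write atomically
CRC_LEN = 4          # CRC32 stored in the last 4 bytes of the block


def seal(payload: bytes) -> bytes:
    """Return the payload followed by its CRC32, filling one block."""
    assert len(payload) == BLOCK_SIZE - CRC_LEN
    return payload + zlib.crc32(payload).to_bytes(CRC_LEN, "big")


def is_intact(block: bytes) -> bool:
    """Detection only: does the stored CRC match the payload?"""
    payload, stored = block[:-CRC_LEN], block[-CRC_LEN:]
    return zlib.crc32(payload).to_bytes(CRC_LEN, "big") == stored


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate power loss after only some sectors of `new` hit the disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


def read_with_repair(primary: bytes, mirror: Optional[bytes]) -> Optional[bytes]:
    """Return a good payload, falling back to a second copy if one exists.

    None means the damage was detected but cannot be fixed here, which is
    the case that is left to an offline tool such as fsck.
    """
    if is_intact(primary):
        return primary[:-CRC_LEN]
    if mirror is not None and is_intact(mirror):
        return mirror[:-CRC_LEN]
    return None
```

With a torn block and a good mirror, read_with_repair() hands back the 
mirror's payload; with no mirror it can only report the damage.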

For what it's worth, this is the same story with databases (DB2, Oracle, 
etc). They spend a lot of energy trying to detect partial writes from 
the application level's point of view and their granularity is often 
multiple fs blocks....
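
As a generic illustration of application-level detection (not the 
on-disk format of DB2, Oracle or any other product), a page that spans 
several fs blocks can carry the same sequence stamp in every block, and 
a reader treats mismatched stamps as a torn page. The page geometry, 
the 8-byte stamp and the function names below are all hypothetical.

```python
FS_BLOCK = 4096        # fs block size (assumed)
BLOCKS_PER_PAGE = 4    # one database page spans several fs blocks (assumed)
STAMP_LEN = 8          # bytes reserved at the start of each fs block


def build_page(seq: int, data: bytes) -> bytes:
    """Lay out a page with the same sequence stamp heading every fs block."""
    chunk = FS_BLOCK - STAMP_LEN
    assert len(data) <= chunk * BLOCKS_PER_PAGE
    data = data.ljust(chunk * BLOCKS_PER_PAGE, b"\0")
    stamp = seq.to_bytes(STAMP_LEN, "big")
    return b"".join(
        stamp + data[i * chunk:(i + 1) * chunk]
        for i in range(BLOCKS_PER_PAGE)
    )


def page_stamps(page: bytes):
    """Read back the stamp stored at the head of each fs block."""
    return [
        int.from_bytes(page[i * FS_BLOCK:i * FS_BLOCK + STAMP_LEN], "big")
        for i in range(BLOCKS_PER_PAGE)
    ]


def is_torn(page: bytes) -> bool:
    """A page is torn if its fs blocks come from different page writes."""
    return len(set(page_stamps(page))) != 1
```

This only works if the stamp changes on every rewrite of the page, and 
like the checksums above it detects the tear without repairing it.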

>
>    
>>> The LWN article on the topic is out, and incomplete as it is I expect it's the
>>> best documentation anybody will actually _read_.
>>>        
> Would anyone (probably privately?) share the lwn link?
> 								Pavel
>    

