Message-ID: <4EAF7033.8070005@tao.ma>
Date:	Tue, 01 Nov 2011 12:06:11 +0800
From:	Tao Ma <tm@....ma>
To:	Ted Ts'o <tytso@....edu>
CC:	Andreas Dilger <adilger@...mcloud.com>,
	linux-ext4 development <linux-ext4@...r.kernel.org>,
	Alex Zhuravlev <bzzz@...mcloud.com>,
	"hao.bigrat@...il.com" <hao.bigrat@...il.com>
Subject: Re: bigalloc and max file size

On 11/01/2011 04:00 AM, Ted Ts'o wrote:
> On Mon, Oct 31, 2011 at 06:27:25PM +0800, Tao Ma wrote:
>> In the new bigalloc case, with a chunk size of 64k and the linux-3.0
>> source, every file is allocated a chunk, but the chunks aren't
>> contiguous if we only write the first 4k bytes. In that case, writeback
>> and the block layer below can't merge the requests sent by ext4, and in
>> our test case the total comes to around 20,000 I/Os. Whereas if we zero
>> the whole cluster: from the upper layers' point of view we have to
>> write more bytes, but from the block layer's point of view the writes
>> are contiguous and it can merge them into big requests. In our test it
>> then does only around 2,000 I/Os. So it helps the test case.
> 
> This is a test case, then, where there are a lot of sub-64k files, so
> the system administrator would be ill-advised to use a 64k bigalloc
> cluster size in the first place.  So I don't really consider that a
> strong argument; in fact, if the block device is an SSD or a
> thin-provisioned device with an allocation size smaller than the
> cluster size, the behaviour you describe would in fact be detrimental,
> not a benefit.
OK, actually the above test case is more natural if we replace the umount
with a sync, and I would guess that is the most common case for a normal
desktop user. Even without the sync, disk utilization will be very high.
Since SSDs aren't yet common in typical users' environments, I would
imagine more people will complain about this once bigalloc gets merged.
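The I/O counts being discussed can be illustrated with a toy model of block-layer request merging (a simplified sketch; the merge function and the 10,000-file count are illustrative assumptions, not the actual test harness, and a real block layer also caps request size, which is why the reported number was ~2,000 rather than one giant request):

```python
def count_merged_requests(extents):
    """Count I/O requests left after merging adjacent extents,
    roughly mimicking block-layer request merging.
    Each extent is (start_block, length) in 4k blocks."""
    requests = 0
    prev_end = None
    for start, length in sorted(extents):
        if start != prev_end:       # gap before this extent: new request
            requests += 1
        prev_end = start + length   # contiguous: folds into the previous one
    return requests

BLOCKS_PER_CLUSTER = 16  # 64k cluster / 4k block

# 10,000 files, each owning one 64k cluster, but only the first
# 4k block of each cluster dirtied: nothing is contiguous.
sparse = [(i * BLOCKS_PER_CLUSTER, 1) for i in range(10000)]

# Same layout, but the whole cluster is zeroed/written: the writes
# are back-to-back and merge (here, into a single request).
full = [(i * BLOCKS_PER_CLUSTER, BLOCKS_PER_CLUSTER) for i in range(10000)]

print(count_merged_requests(sparse))  # 10000
print(count_merged_requests(full))    # 1
```
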
> 
> In the case of a hard drive where seeks are expensive relative to
> small writes, this is something which we could do (zero out the whole
> cluster) with the current bigalloc file system format.  I could
> imagine trying to turn this on automatically with a heuristic, but
> since we can't know the underlying allocation size of a
> thin-provisioned block device, that would be tricky at best...
OK, if we decide to leave the extent length in block units, we can do
something tricky like CFQ does and read the rotational flag of the
underlying device. It is a bit of a pain, but we have to handle it, as I
mentioned above.

Thanks
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html