Date: Sun, 6 May 2007 13:04:10 +0800
From: "Xu CanHao" <xucanhao@...il.com>
To: 7eggert@....de
Cc: "Theodore Tso" <tytso@....edu>, linux-kernel@...r.kernel.org
Subject: Re: Ext3 vs NTFS performance

2007/5/6, Bodo Eggert <7eggert@....de>:
> Theodore Tso <tytso@....edu> wrote:
>
> > But as has already been discussed on this thread, in situations where
> > the fileserver is under high memory pressure, any filesystem (XFS or
> > ext4) would still end up allocating blocks out of order, resulting in
> > fragmentation. Explicit preallocation, as opposed to delayed
> > allocation, is really the best long-term solution; and in order to do
> > that, Samba needs to detect this scenario --- which as has been noted,
> > there appears to be no good reason for the Windows CIFS client (or any
> > other application) to be doing this, other than perhaps to deliberately
> > trigger a worst case allocation pattern in ext3 --- and translate it
> > into an explicit preallocation request.
>
> There is an interface to tell the kernel about the way the file will be
> accessed. IMO this interface should be used to do the preallocation, too.
>
> The other question is: How to tell the poor-bill's preallocation from a
> very clever application that communicates with another application and
> which is supposed to zero out that exact byte from the data the other
> application sent. I was tempted to say "just let samba cache these calls",
> but it would be wrong. You'll need magic in the kernel to DTRT.
>
> There are four correct ways of handling these one-zerobyte-writes after EOF:
>
> 1) Extend the file like truncate
> 2) Extend the file like write() (current behaviour)
> 3) Preallocate these blocks (to be implemented)
> 4) Write all zeroes (current behaviour for FAT)
>
> (2) will cause bad allocations, it's obviously worse than (1). (3) would be
> better than (1) and (2), but only xfs(?) and ext4 will support this in the
> near future. (4) should double the write time, but give the best possible
> read speed. According to [1], the expected read speed is about as high as (1)
> gives, "playback performance improves to expected levels". If preallocation
> does not seem to make a big difference, I don't think we should do (4) as
> a replacement until the filesystem does support real preallocations.
>
>
> I suggest:
>
> 1) Make samba use fadvise(MIGHT_PREALLOCATE)
> 2) Make the kernel turn these 1-byte-writes-after-EOF into truncates
>    on MIGHT_PREALLOCATE, and possibly turn off MIGHT_PREALLOCATE on
>    other read/writes
> 3) Make the kernel fadvise(PREALLOCATE, $filesize)
>    on MIGHT_PREALLOCATE + lseek(0), turning off the MIGHT_PREALLOCATE
>    Possibly it might also turn on FADV_SEQUENTIAL.
> 4) Make the filesystems optionally preallocate the desired area, or
>    ignore fadvise(PREALLOCATE, $filesize) instead.
>
>
> [1] http://softwarecommunity.intel.com/articles/eng/1259.htm
> --
> It is still called paranoia when they really are out to get you.
>
> Friß, Spammer: oA@...2dX.7eggert.dyndns.org
> CZCkzfiaNb@...gert.dyndns.org nkp@...gert.dyndns.org
>

So it seems possible that "Explicit Preallocation" + "Delayed Allocation"
+ (some other technology) would minimize file-system fragmentation.

Furthermore, the massive fragmentation of large downloads could be
solved by "Explicit Preallocation" too.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
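
For readers who want to see options (1) to (3) from the quoted list side by side, here is a minimal userspace sketch in Python. It is an illustration under stated assumptions, not what Samba or the kernel does: the helper names and the 1 MiB size are made up for the example, and the fadvise(MIGHT_PREALLOCATE) / fadvise(PREALLOCATE) flags proposed in the thread are not existing interfaces, so the sketch swaps in the standard posix_fallocate() call for the "explicit preallocation" case. It needs a Unix-like system.

```python
import os
import tempfile

SIZE = 1 << 20  # 1 MiB target size, chosen arbitrarily for this illustration


def extend_by_truncate(fd, size):
    """Option (1): set the file length only; no data is written."""
    os.ftruncate(fd, size)


def extend_by_one_byte_write(fd, size):
    """Option (2): write a single zero byte at the final offset,
    which is the access pattern discussed in the thread."""
    os.pwrite(fd, b"\0", size - 1)


def extend_by_preallocation(fd, size):
    """Option (3): ask the system to reserve space for the whole range.
    posix_fallocate() stands in for the hypothetical fadvise flags."""
    os.posix_fallocate(fd, 0, size)


def run(strategy, size=SIZE):
    """Apply one strategy to a fresh temporary file.
    Returns (st_size, st_blocks) as reported by fstat()."""
    with tempfile.NamedTemporaryFile() as tmp:
        strategy(tmp.fileno(), size)
        st = os.fstat(tmp.fileno())
        return st.st_size, st.st_blocks


if __name__ == "__main__":
    for strategy in (extend_by_truncate,
                     extend_by_one_byte_write,
                     extend_by_preallocation):
        size, blocks = run(strategy)
        # st_blocks is counted in 512-byte units; the value depends on the filesystem
        print(f"{strategy.__name__}: size={size} blocks={blocks}")
```

All three calls leave the file with the same logical length; what differs is how many blocks the filesystem reports as allocated afterwards, and that number varies by filesystem. The sketch does not measure fragmentation itself.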