linux-kernel - Re: [00/17] Large Blocksize Support V3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <m1tzupkzmw.fsf@ebiederm.dsl.xmission.com>
Date:	Sun, 06 May 2007 22:48:23 -0600
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	David Chinner <dgc@....com>
Cc:	Theodore Tso <tytso@....edu>,
	Andrew Morton <akpm@...ux-foundation.org>, clameter@....com,
	linux-kernel@...r.kernel.org, Mel Gorman <mel@...net.ie>,
	William Lee Irwin III <wli@...omorphy.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Badari Pulavarty <pbadari@...il.com>,
	Maxim Levitsky <maximlevitsky@...il.com>
Subject: Re: [00/17] Large Blocksize Support V3

David Chinner <dgc@....com> writes:

> On Fri, May 04, 2007 at 07:33:54AM -0600, Eric W. Biederman wrote:
>> >
>> > So while the jury is out about how many other filesystems might use
>> > it, I suspect it's more than you might think.  At the very least,
>> > there may be some IA64 users who might be trying to transition their
>> > way to x86_64, and have existing filesystems using a 8k or 16k
>> > block filesystems.  :-)
>> 
>> How much of a problem would it be if those blocks were not necessarily
>> contiguous in RAM, but placed in normal 4K pages in the page cache?
>
> If you need to treat the block in a contiguous range, then you need to
> vmap() the discontiguous pages. That has substantial overhead if you
> have to do it regularly.

Which is why I would prefer not to do it.  I think vmap is not really
compatible with the design of the linux page cache.

Although we can't even count on the pages being mapped into low
memory right now and have to call kmap if we want to access them
so things might not be that bad.  Even if it was a multipage kmap
type operation.

> We do this in xfs_buf.c for > page size blocks - the overhead that
> caused when operating on inode clusters resulted in us doing some
> pointer fiddling and directly addresing the contents of each page
> to avoid the vmap overhead. See xfs_buf_offset() and friends....
>
>> I expect meta data operations would have to be modified but that otherwise
>> you would not care.
>
> I think you might need to modify the copy-in and copy-out operations
> substantially (e.g. prepare_/commit_write()) as they assume a buffer doesn't
> span multple pages.....

But in a filesystem like ext2 except for a zeroing some unused hunks
of the page all that really happens is you setup for DMA straight out
of the page cache.  So this is primarily an issue for meta-data.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/