Message-ID: <20070426182456.GA13066@infradead.org>
Date:	Thu, 26 Apr 2007 19:24:56 +0100
From:	Christoph Hellwig <hch@...radead.org>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	Christoph Hellwig <hch@...radead.org>,
	Christoph Lameter <clameter@....com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	David Chinner <dgc@....com>, linux-kernel@...r.kernel.org,
	Mel Gorman <mel@...net.ie>,
	William Lee Irwin III <wli@...omorphy.com>,
	Badari Pulavarty <pbadari@...il.com>,
	Maxim Levitsky <maximlevitsky@...il.com>
Subject: Re: [00/17] Large Blocksize Support V3

On Thu, Apr 26, 2007 at 08:12:51PM +0200, Jens Axboe wrote:
> > Exactly.  But the only counter-proposal we have so far seems far worse :)
> 
> Let's look at some numbers. I'll just concentrate on the scatterlist,
> since the bio_vec is smaller. On x86 32-bit, the scatterlist is 20 bytes
> long. If we accept that order-1 (2^1 page) allocations are ok (they
> should be), then we can support ~1.6MB I/Os just like that.
> 
> My approach would be to support scatterlist chaining. Essentially you'd
> have the last element of the sglist pointing to the next array of
> entries. We can then stick to 128 entry arrays which fit nicely in a
> single page allocation and easily support >> 2MB I/Os. The only caveat is
> that you'd need to update the drivers to get there, since a regular
> iteration over the array isn't enough. My plan was to add an sglist
> iterator helper that hides this from the drivers, if they need to loop
> over the scatterlist. Things like {dma/pci}_map_sg() would of course be
> updated.
> 
> The above can be implemented fairly cleanly, and on a need-to-have
> basis. It's not something that'll break drivers.
> 
> What do you think?

Purely for the problem of I/O sizes to external arrays, that's nice,
and I think we (well, you :)) should implement it.
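
To make the chaining idea concrete, here is a minimal userspace sketch in C.
It is not the kernel's struct scatterlist or any existing kernel API: the
names (sg_entry, sg_chain_alloc, sg_next_entry, sg_count, sg_chain_free,
sg_max_io_bytes) and the flags field are hypothetical, chosen only for
illustration. It assumes that the last slot of every non-final 128-entry
array is used as the link to the next array, and that each entry maps one
4096-byte page. The helper sg_max_io_bytes() redoes the arithmetic from the
quoted mail under those assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SG_CHUNK      128  /* entries per array, as in the proposal */
#define SG_FLAG_CHAIN 1u   /* entry is a link to the next array */
#define SG_FLAG_LAST  2u   /* entry is the final data entry */

/* Hypothetical layout; NOT the kernel's 20-byte x86-32 scatterlist. */
struct sg_entry {
    void        *page;     /* data page, or next array for a chain entry */
    unsigned int offset;
    unsigned int length;
    unsigned int flags;
};

/* Largest I/O a flat array can map, assuming one page per entry. */
size_t sg_max_io_bytes(size_t alloc_bytes, size_t entry_bytes, size_t page_bytes)
{
    return (alloc_bytes / entry_bytes) * page_bytes;
}

/* Allocate chained arrays with room for n data entries (n > 0). */
struct sg_entry *sg_chain_alloc(size_t n)
{
    struct sg_entry *head = NULL, *link = NULL, *last = NULL;

    assert(n > 0);
    while (n > 0) {
        /* Only the final array may use its last slot for data. */
        size_t data = (n <= SG_CHUNK) ? n : SG_CHUNK - 1;
        struct sg_entry *chunk = calloc(SG_CHUNK, sizeof(*chunk));

        if (!chunk)
            abort();
        if (link) {
            link->page = chunk;           /* previous array points here */
            link->flags = SG_FLAG_CHAIN;
        } else {
            head = chunk;
        }
        link = &chunk[SG_CHUNK - 1];
        last = &chunk[data - 1];
        n -= data;
    }
    last->flags = SG_FLAG_LAST;
    return head;
}

/* Iterator helper: hides the chain links from the caller. */
struct sg_entry *sg_next_entry(struct sg_entry *sg)
{
    if (sg->flags & SG_FLAG_LAST)
        return NULL;
    sg++;
    if (sg->flags & SG_FLAG_CHAIN)
        sg = sg->page;                    /* hop to the next array */
    return sg;
}

/* Count data entries by walking the list through the iterator. */
size_t sg_count(struct sg_entry *sg)
{
    size_t n = 0;

    for (; sg; sg = sg_next_entry(sg))
        n++;
    return n;
}

/* Free every array in the chain. */
void sg_chain_free(struct sg_entry *chunk)
{
    while (chunk) {
        struct sg_entry *tail = &chunk[SG_CHUNK - 1];
        struct sg_entry *next =
            (tail->flags & SG_FLAG_CHAIN) ? tail->page : NULL;

        free(chunk);
        chunk = next;
    }
}
```

With an 8192-byte (two-page) allocation and 20-byte entries,
sg_max_io_bytes(8192, 20, 4096) gives 409 entries * 4096 bytes = 1675264
bytes, which matches the ~1.6MB figure. A caller that loops with
sg_next_entry() instead of indexing the array directly never sees the link
entries, which is the point of the iterator helper.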

But there are other reasons why larger objects in the page cache make
sense, mostly related to keeping operating system overhead for large
files down.  So I'd go for both s/g list chaining and variable order
pagecache.

Btw, we should talk a little about the sglist iterators on linux-scsi,
as a lot of the dma mapping API will need updates for bidirectional DMAs
anyway, and we should try to get everything done in one rush.