linux-kernel - Re: [00/17] Large Blocksize Support V3


Message-ID: <20070427060921.GA77450368@melbourne.sgi.com>
Date:	Fri, 27 Apr 2007 16:09:21 +1000
From:	David Chinner <dgc@....com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	David Chinner <dgc@....com>, clameter@....com,
	linux-kernel@...r.kernel.org, Mel Gorman <mel@...net.ie>,
	William Lee Irwin III <wli@...omorphy.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Badari Pulavarty <pbadari@...il.com>,
	Maxim Levitsky <maximlevitsky@...il.com>
Subject: Re: [00/17] Large Blocksize Support V3

On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <dgc@....com> wrote:
> 
> > >    blocksizes via this scheme - instantiate and lock four pages and go for
> > >    it.
> > 
> > So now how do you get block aligned writeback?
> 
> in writeback and pageout:
> 
> 	if (page->index & mapping->block_size_mask)
> 		continue;

So we might do writeback on one page in N - how do we
make sure none of the other pages are reclaimed while we are doing
writeback on this block?

IOWs, we have to lock every page in the block, mark them all as
writeback, etc. Instead of doing something once, we have
to repeat it for every page in the block. This is better than a compound
page, how?
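
To illustrate the per-page cost described above, here is a minimal
user-space sketch in C. It is not kernel code: struct model_page,
PAGES_PER_BLOCK, start_block_writeback() and writeback_scan() are
hypothetical names invented for this model, and a block is assumed to
be four pages. Only the mask test mirrors the quoted snippet.

```c
#include <assert.h>

/* Hypothetical user-space model, not kernel code. */
#define PAGES_PER_BLOCK 4UL                    /* assumed: 4 pages per block */
#define BLOCK_SIZE_MASK (PAGES_PER_BLOCK - 1)  /* stands in for mapping->block_size_mask */

struct model_page {
	unsigned long index;   /* page offset within the file */
	int locked;            /* models the page lock */
	int writeback;         /* models the writeback flag */
};

/*
 * Start writeback on the block whose first page is pages[start].
 * Every page in the block is locked and marked individually.
 * Returns the number of per-page operations performed.
 */
static unsigned long start_block_writeback(struct model_page *pages,
					   unsigned long start)
{
	unsigned long ops = 0;

	/* The caller must pass a block-aligned page. */
	assert((pages[start].index & BLOCK_SIZE_MASK) == 0);

	for (unsigned long i = start; i < start + PAGES_PER_BLOCK; i++) {
		pages[i].locked = 1;
		ops++;
		pages[i].writeback = 1;
		ops++;
	}
	return ops;
}

/*
 * Scan pages, skipping every page that is not the first page of a
 * block, as in the quoted "if (page->index & mask) continue;" test.
 * Returns the total number of per-page operations performed.
 */
static unsigned long writeback_scan(struct model_page *pages,
				    unsigned long nr_pages)
{
	unsigned long ops = 0;

	for (unsigned long i = 0; i < nr_pages; i++) {
		if (pages[i].index & BLOCK_SIZE_MASK)
			continue;
		if (i + PAGES_PER_BLOCK > nr_pages)
			break;  /* incomplete trailing block, ignored in this model */
		ops += start_block_writeback(pages, i);
	}
	return ops;
}
```

In this model the skip test only decides where a block starts; the
lock and writeback steps are still repeated once per page in the
block, where a single compound page would need each step once.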

> > Or make sure that truncate
> > doesn't race on a partial *block* truncate?
> 
> lock four pages

And the locking order? How do you enforce *kernel wide* the
same locking order for all pages in the same block so that we
don't get ABBA deadlocks on page locks within a block?

i.e:

> > You basically have to
> > jump through nasty, nasty hoops, to handle corner cases that are introduced
> > because the generic code can no longer reliably lock out access to a
> > filesystem block.

This way lies insanity.
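
As a rough illustration of the lock-ordering question above, the
following user-space sketch in C shows the kind of convention every
code path would have to follow: always take the page locks of a block
in ascending index order. It uses POSIX mutexes as a stand-in for page
locks; struct model_block, block_init(), block_lock_all() and
block_unlock_all() are hypothetical names for this model and are not
kernel interfaces.

```c
#include <pthread.h>

/* Hypothetical user-space model, not kernel code. */
#define PAGES_PER_BLOCK 4   /* assumed: 4 pages per block */

struct model_block {
	pthread_mutex_t page_lock[PAGES_PER_BLOCK];  /* one lock per page */
};

/* Initialise all page locks; returns 0 on success, -1 on failure. */
static int block_init(struct model_block *b)
{
	for (int i = 0; i < PAGES_PER_BLOCK; i++)
		if (pthread_mutex_init(&b->page_lock[i], NULL) != 0)
			return -1;
	return 0;
}

/*
 * Take every page lock in ascending index order. If any path took
 * them in a different order, two threads could each hold a lock the
 * other needs (an ABBA deadlock).
 */
static void block_lock_all(struct model_block *b)
{
	for (int i = 0; i < PAGES_PER_BLOCK; i++)
		pthread_mutex_lock(&b->page_lock[i]);
}

/* Release the locks in reverse order of acquisition. */
static void block_unlock_all(struct model_block *b)
{
	for (int i = PAGES_PER_BLOCK - 1; i >= 0; i--)
		pthread_mutex_unlock(&b->page_lock[i]);
}
```

The sketch only works because both helpers are the sole way to lock a
block. Any caller that already holds one page of the block and then
wants the rest does not fit this scheme, which is the kernel-wide
enforcement problem raised above.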

> > way to serialise access to these aggregated structures. This is
> > the way XFS used to work in it's data path, and we all know how long
> > and loud people complained about that.....
> > 
> > A filesystem specific aggregation mechanism is not a palatable solution
> > here because it drives filesystems away from being able to use generic
> > code. 
> 
> I would expect we could (should) implement this in generic code by
> modifying the existing stuff.

So you're suggesting that we reintroduce a buffer-oriented filesystem
interface to support large block sizes?

> I'm not saying it's especially simple, nor fast.  But it has the advantage
> that we're not forced to use larger pages with _it's_ attendant performance
> problems.

So you'll take slow, inefficient and complex rather than use a
non-intrusive and /optional/ interface to large pages?

Words fail me......

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group