linux-kernel - Re: [00/17] Large Blocksize Support V3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20070427000403.6013d1fa.akpm@linux-foundation.org>
Date:	Fri, 27 Apr 2007 00:04:03 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	David Chinner <dgc@....com>
Cc:	clameter@....com, linux-kernel@...r.kernel.org,
	Mel Gorman <mel@...net.ie>,
	William Lee Irwin III <wli@...omorphy.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Badari Pulavarty <pbadari@...il.com>,
	Maxim Levitsky <maximlevitsky@...il.com>
Subject: Re: [00/17] Large Blocksize Support V3

On Fri, 27 Apr 2007 16:09:21 +1000 David Chinner <dgc@....com> wrote:

> On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> > On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <dgc@....com> wrote:
> > 
> > > >    blocksizes via this scheme - instantiate and lock four pages and go for
> > > >    it.
> > > 
> > > So now how do you get block aligned writeback?
> > 
> > in writeback and pageout:
> > 
> > 	if (page->index & mapping->block_size_mask)
> > 		continue;
> 
> So we might do writeback on one page in N - how do we
> make sure none of the other pages are reclaimed while we are doing
> writeback on this bclok?

By marking them all dirty when one is marked dirty.

David, you're perfectly capable of working all this out yourself.  But
you're trying not to.  Please stop this game.

> IOWs, we have to lock every page in the block, mark them all as
> writeback, etc. Instead of doing something once, we have
> to repeat it for every block in page. This is better than a compound
> page, how?

I already said it'd be a bit slower.  But given that those four pageframes
will fall within two cachelines the cost will be small.

> > > Or make sure that truncate
> > > doesn't race on a partial *block* truncate?
> > 
> > lock four pages
> 
> And the locking order? How do you enforce *kernel wide* the
> same locking order for all pages in the same block so that we
> don't get ABBA deadlocks on page locks within a block?

Oh stop it.  You know perfectly well how to do this.  It is trivial.

> i.e:
> 
> > > You basically have to
> > > jump through nasty, nasty hoops, to handle corner cases that are introduced
> > > because the generic code can no longer reliably lock out access to a
> > > filesystem block.
> 
> This way lies insanity.
> 

You're addressing Christoph's straw man here.

> > > way to serialise access to these aggregated structures. This is
> > > the way XFS used to work in it's data path, and we all know how long
> > > and loud people complained about that.....
> > > 
> > > A filesystem specific aggregation mechanism is not a palatable solution
> > > here because it drives filesystems away from being able to use generic
> > > code. 
> > 
> > I would expect we could (should) implement this in generic code by
> > modifying the existing stuff.
> 
> So you're suggesting that we reintroduce a buffer-oriented filesystem
> interface to support large block sizes?

Nothing vaguely like it.  Please be serious.

> > I'm not saying it's especially simple, nor fast.  But it has the advantage
> > that we're not forced to use larger pages with _it's_ attendant performance
> > problems.
> 
> So you'll take slow, inefficient and complex rather than use an
> non-intrusive and /optional/ interface to large pages?

The optionality is useful - it at least means that we can easily remove it
all if/when it becomes obsolete.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/