Message-ID: <20070302062950.GG15867@wotan.suse.de>
Date:	Fri, 2 Mar 2007 07:29:50 +0100
From:	Nick Piggin <npiggin@...e.de>
To:	Christoph Lameter <clameter@...r.sgi.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mel@...net.ie>, mingo@...e.hu,
	jschopp@...tin.ibm.com, arjan@...radead.org,
	torvalds@...ux-foundation.org, mbligh@...igh.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: The performance and behaviour of the anti-fragmentation related patches

On Thu, Mar 01, 2007 at 10:19:48PM -0800, Christoph Lameter wrote:
> On Fri, 2 Mar 2007, Nick Piggin wrote:
> 
> > > From the I/O controller and from the application. 
> > 
> > Why doesn't the application need to deal with TLB entries?
> 
> Because it may only operate on a small section of the file and hopefully 
> splice the rest through? But yes support for mmapped I/O would be 
> necessary.

So you're talking about copying a file from one location to another?
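
If so, note that a splice-through copy is already expressible from
userspace without mapping the bulk of the file. A minimal sketch,
assuming splice(2); the 64k chunk size is illustrative and error
handling is pared down:

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Copy in_fd to out_fd through a pipe with splice(2): the pages move
 * (or are copied) inside the kernel, so the application never maps
 * the copied range and burns no TLB entries on it.
 */
static int splice_copy(int in_fd, int out_fd)
{
        int pfd[2];
        ssize_t n;

        if (pipe(pfd) < 0)
                return -1;

        for (;;) {
                /* file -> pipe */
                n = splice(in_fd, NULL, pfd[1], NULL, 1 << 16,
                           SPLICE_F_MOVE);
                if (n <= 0)
                        break;
                /* pipe -> file: drain what we just queued */
                while (n > 0) {
                        ssize_t m = splice(pfd[0], NULL, out_fd, NULL,
                                           n, SPLICE_F_MOVE);
                        if (m <= 0) {
                                n = -1;
                                goto out;
                        }
                        n -= m;
                }
        }
out:
        close(pfd[0]);
        close(pfd[1]);
        return n < 0 ? -1 : 0;
}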


> > > This would only be a temporary fix pushing the limits to the double or so?
> > 
> > And using slightly larger page sizes isn't?
> 
> There was no talk about slightly. 1G page size would actually be quite 
> convenient for some applications.

But it is far from convenient for the kernel. That is why we have
hugepages: so we can stay out of the hair of those applications, and
they can stay out of ours.
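
For reference, that opt-in looks roughly like this from userspace; the
/mnt/huge mount point and the 2MB size are illustrative assumptions,
not something from this thread:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#define HUGE_SZ (2UL * 1024 * 1024)     /* one 2MB x86-64 hugepage */

int main(void)
{
        /*
         * Assumes hugetlbfs is mounted at /mnt/huge and pages were
         * reserved via /proc/sys/vm/nr_hugepages; touching the
         * mapping without a reservation raises SIGBUS.
         */
        int fd = open("/mnt/huge/buf", O_CREAT | O_RDWR, 0600);
        char *p;

        if (fd < 0)
                return 1;

        p = mmap(NULL, HUGE_SZ, PROT_READ | PROT_WRITE, MAP_SHARED,
                 fd, 0);
        if (p == MAP_FAILED)
                return 1;

        p[0] = 1;       /* the whole 2MB sits behind one TLB entry */

        munmap(p, HUGE_SZ);
        close(fd);
        unlink("/mnt/huge/buf");
        return 0;
}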

> > > Amortized? The controller still would have to hunt down the 4kb page 
> > > pieces that we have to feed him right now. Result: Huge scatter gather 
> > > lists that may themselves create issues with higher page order.
> > 
> > What sort of numbers do you have for these controllers that aren't
> > very good at doing sg?
> 
> Writing a terabyte of memory to disk while handling 256 billion page 
> structs? In the case of a system with 1 petabyte of memory this may be 
> rather typical, and necessary for the application to be able to save its 
> state on disk.

But you will have newer IO controllers, faster CPUs...

Is it a problem or isn't it? Waving around the 256 billion number isn't
impressive because it doesn't really say anything.
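
For what it's worth, the arithmetic behind it: 1PB at a 4KB page size
is 2^50 / 2^12 = 2^38 page structs, i.e. 256 * 2^30, which is where the
256 billion comes from; dumping a single terabyte walks 2^28 of them,
about 270 million.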

> > Wasn't the issue something like this: your IO controllers have only a
> > limited number of sg entries, which is fine with 16K pages, but with
> > 4K pages that doesn't give enough data to cover your RAID stripe?
> > 
> > We're never going to do a variable-sized pagecache just because of that.
> 
> No, we need support for page sizes larger than 16k. 16k has not been fine 
> for a couple of years. We only agreed to 16k because that was the 
> consensus at the time. Best performance was always at 64k four years ago 
> (though we had no numbers for higher page sizes back then). Now we would 
> prefer much larger sizes.

But you are in a tiny minority, so it is not so much a question of what
you prefer as of what you can make do with without being too intrusive.

I understand you have controllers (or maybe it is a block layer limit)
that don't work well with 4K pages but work OK with 16K pages. This is
not something we would introduce a variable-sized pagecache for, surely.
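
To put numbers on it (the sg limit here is made up, purely for
illustration): a controller capped at 256 sg entries per command covers
at most 256 * 4KB = 1MB of non-contiguous memory per request, which can
fall short of a wide RAID stripe, while the same 256 entries at 16KB
pages cover 4MB. That is a driver/controller ceiling, not an argument
about pagecache granularity.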
