Message-ID: <20120529180928.GL21339@redhat.com>
Date:	Tue, 29 May 2012 20:09:28 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Hillf Danton <dhillf@...il.com>, Dan Smith <danms@...ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Mike Galbraith <efault@....de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Bharata B Rao <bharata.rao@...il.com>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Christoph Lameter <cl@...ux.com>
Subject: Re: [PATCH 13/35] autonuma: add page structure fields

On Tue, May 29, 2012 at 10:38:34AM -0700, Linus Torvalds wrote:
> A big fraction of one percent is absolutely unacceptable.
> 
> Our "struct page" is one of our biggest memory users, there's no way
> we should cavalierly make it even bigger.
> 
> It's also a huge performance sink, the cache miss on struct page tends
> to be one of the biggest problems in managing memory. We may not ever
> fix that, but making struct page bigger certainly isn't going to help
> the bad cache behavior.

The cache effects on the VM fast paths shouldn't be altered, and no
additional per-page memory is allocated when booting the same bzImage
on non-NUMA hardware.

But now, when booted on NUMA hardware, it takes 8 bytes more per page
than before: 32 bytes are allocated for every page (with autonuma13
it was only 24 bytes). The struct page itself isn't modified.

I want to remove the page pointer from the page_autonuma structure, to
bring the overhead back to 0.58% from the current 0.78% (as it was
on autonuma alpha13 before the page_autonuma introduction). That
shouldn't be difficult and it's the next step.

Those changes aren't visible to anything but the *autonuma.* files, and
the cache misses in accessing the page_autonuma structure shouldn't be
measurable (the only fast path access is from
autonuma_free_page). Even if we find a way to shrink it below 0.58%,
it won't be intrusive to the rest of the kernel.

memcg takes 0.39% on every system built with
CONFIG_CGROUP_MEM_RES_CTLR=y unless the kernel is booted with
cgroup_disable=memory (and nobody does).

I'll do my best to shrink it further. As mentioned, I'm very willing
to experiment with a fixed-size array sized as a function of the RAM
per node, to reduce the overhead (Michel and Rik suggested that at MM
summit too). Maybe it'll just work fine even if the max size of the lru
is reduced by a factor of 10. In the worst case I personally believe
lots of people would be OK paying 0.58%, considering they're paying
0.39% even on much smaller non-NUMA systems to boot with memcg. And I'm
sure I can reduce it at least to 0.58% without any downside.

It's a lot of work to reduce it below 0.58%, so before doing that I
believe it's fair to first do enough performance measurements and
reviews to be sure the design flies.
