linux-kernel - Re: [PATCH UPDATED] percpu: use dynamic percpu allocator as the default percpu allocator

Message-ID: <20090414171242.GA4241@elte.hu>
Date:	Tue, 14 Apr 2009 19:12:42 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Christoph Lameter <cl@...ux.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	rusty@...tcorp.com.au, tglx@...utronix.de, x86@...nel.org,
	linux-kernel@...r.kernel.org, hpa@...or.com,
	Paul Mundt <lethal@...ux-sh.org>, rmk@....linux.org.uk,
	starvik@...s.com, ralf@...ux-mips.org, davem@...emloft.net,
	cooloney@...nel.org, kyle@...artin.ca, matthew@....cx,
	grundler@...isc-linux.org, takata@...ux-m32r.org,
	benh@...nel.crashing.org, rth@...ddle.net,
	ink@...assic.park.msu.ru, heiko.carstens@...ibm.com,
	Nick Piggin <npiggin@...e.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH UPDATED] percpu: use dynamic percpu allocator as the
	default percpu allocator

* Christoph Lameter <cl@...ux.com> wrote:

> On Tue, 14 Apr 2009, Ingo Molnar wrote:
> 
> > The thing is, i spent well in excess of an hour analyzing your
> > patch, counting cachelines, looking at effects and interactions,
> > thinking about the various implications. I came up with a good deal
> > of factoids, a handful of suggestions and a few summary paragraphs:
> >
> >   http://marc.info/?l=linux-kernel&m=123862536011780&w=2
> 
> Good work.
> 
> > A proper reply to that work would be one of several responses:
> >
> 
> ...
> 
> >
> >   3) agree with the factoids and disagree with my opinion.
> 
> Yep I thought that what I did...

Ok, thanks ... since i never saw a reply from you on that 
mail i couldn't assume you did so.

There are really 3 key observations in that mail - let me sum 
them up in order of importance.

1)

I'm wondering what your take on the bss+data suggestion is. 
To me it appears it's tempting to merge them into a single 
per .o section: it clearly wins us locality of reference.

It seems so obvious to do to me on a modern SMP kernel - has 
anyone tried that in the past?

Instead of:

  .data1 .data2 .data3   .... .bss1 .bss2 .bss3

we'd have:

  .databss1 .databss2 .databss3

This is clearly better compressed, and the layout is easier 
to control in the .c file. We could also do tricks to 
further compress data here: we could put variables right 
after their __aligned__ locks - while currently they are in 
the .bss wasting a full cache-line.

In the example i analyzed it would reduce the cache 
footprint by one cacheline. This would apply to most .o's so 
the combined effect on cache locality would be significant.
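
A minimal userspace sketch of the "variables right after their aligned lock" idea follows. It is an illustration only, not the kernel's code: `demo_lock`, `split_lock`, `split_vars`, `merged_state` and the 64-byte `CACHELINE` value are all hypothetical names and assumptions, and `__attribute__((aligned))` is the GCC/Clang spelling. It counts how many cachelines the lock plus its two counters occupy when the lock sits alone in an aligned slot and the counters live elsewhere, versus when they share one aligned object.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64 /* assumed cacheline size, in bytes */

/* Hypothetical stand-in for a kernel lock type. */
struct demo_lock { int slock; };

/* Split layout: the aligned lock owns a whole cacheline (think .data),
 * while the zero-initialized counters end up somewhere else (think .bss). */
struct split_lock {
	struct demo_lock lock;
} __attribute__((aligned(CACHELINE)));

struct split_vars {
	unsigned long nr_items;
	unsigned long nr_pending;
};

/* Merged layout: the counters fill the padding behind the lock. */
struct merged_state {
	struct demo_lock lock;
	unsigned long nr_items;
	unsigned long nr_pending;
} __attribute__((aligned(CACHELINE)));

/* Number of cachelines needed to hold 'bytes' bytes. */
size_t cachelines(size_t bytes)
{
	return (bytes + CACHELINE - 1) / CACHELINE;
}

/* Lock line plus the separate line(s) holding the counters. */
size_t split_footprint(void)
{
	return cachelines(sizeof(struct split_lock)) +
	       cachelines(sizeof(struct split_vars));
}

size_t merged_footprint(void)
{
	return cachelines(sizeof(struct merged_state));
}

size_t cachelines_saved(void)
{
	/* The comparison assumes the merged object fits in one line. */
	assert(sizeof(struct merged_state) == CACHELINE);
	return split_footprint() - merged_footprint();
}
```

Under these assumptions the merged layout needs one cacheline fewer than the split one, which matches the one-cacheline saving estimated above for the analyzed example.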

[ Another (sub-)advantage would be that it 'linearizes' and 
  hence properly colors the per .o module variable layout. 
  With an artificially split .data1 .bss1 the offset between 
  them is random, and it's harder to control the cache port
  positions of closely related variables. ]

2)

Aligning (the now merged) data+bss per .o section on 
cacheline boundary [up to 64 byte cacheline sizes or so] 
sounds tempting as well - it eliminates accidental "tail 
bites the next head" type of cross-object-file interactions. 

The price is an estimated 3% blow-up in combined .data+bss 
size. I suspect a patch and measurements would settle this 
pretty neatly.
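
As a rough sketch of how such a measurement could be tallied, the helper below rounds each per-.o data+bss size up to a cacheline boundary and reports the padding overhead. The function names are hypothetical and any sizes fed to it are the caller's own; it says nothing about the real kernel figure, which only an actual build can provide.

```c
#include <assert.h>
#include <stddef.h>

/* Round 'size' up to the next multiple of 'align' (a power of two). */
size_t align_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Sum of the per-object sizes with no padding between them. */
size_t raw_total(const size_t *sizes, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += sizes[i];
	return total;
}

/* Sum of the per-object sizes with each one padded to 'align'. */
size_t padded_total(const size_t *sizes, size_t n, size_t align)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += align_up(sizes[i], align);
	return total;
}

/* Padding overhead as a whole percentage of the unpadded total. */
size_t padding_percent(const size_t *sizes, size_t n, size_t align)
{
	size_t raw = raw_total(sizes, n);

	if (raw == 0)
		return 0;
	return (padded_total(sizes, n, align) - raw) * 100 / raw;
}
```

Small objects dominate the overhead in this model, so the result depends heavily on the size distribution of the per-.o sections being measured.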

3)

The free_percpu() collateral-damage argument i made was 
pretty speculative (and artificial as well - the allocation 
of percpu resources is very global in nature so a 
kfree(NULL)-alike fastpath is harder to imagine) - i tried 
at all costs to demonstrate my point based on that narrow 
example alone.

Thanks,

	Ingo