linux-kernel - Re: [GIT PATCH] x86,percpu: fix pageattr handling with remap allocator

Message-ID: <4A14B271.5010202@kernel.org>
Date:	Thu, 21 May 2009 10:46:25 +0900
From:	Tejun Heo <tj@...nel.org>
To:	suresh.b.siddha@...el.com
CC:	"H. Peter Anvin" <hpa@...or.com>,
	"JBeulich@...ell.com" <JBeulich@...ell.com>,
	"andi@...stfloor.org" <andi@...stfloor.org>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"linux-kernel-owner@...r.kernel.org" 
	<linux-kernel-owner@...r.kernel.org>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PATCH] x86,percpu: fix pageattr handling with remap allocator

Hello,

Suresh Siddha wrote:
>> The dynamic onlining will probably use 4k pages so, yeah, it won't
>> have the alias issues but that's not the issue here, right?  You can
>> already avoid aliasing that way by simply using 4k allocator from the
>> get-go.
> 
> But now that I've learnt about dynamic online allocation, can we avoid
> the complexity brought by this patchset by simply using the 4k
> allocator from the get-go?

Sure, we can.

> i.e., can we drop this remap pageattr handling patchset and simply use
> 4k mapping for now? And move to dynamic allocation at a later point.

4k or not, x86 is already on dynamic allocation.  The only difference
is how the first chunk is allocated.

> This will simplify quite a bit of code.

Yes, it will.  The question is which way would be better.  Till now,
there hasn't been any actual data on how remap compares to 4k.  The
only thing we know is that, on UMA, embed should behave exactly the
same for static percpu variables as things did before the dynamic
allocator was introduced.

On NUMA, both remap and 4k add some level of TLB pressure.  remap will
waste one more PMD TLB entry (dup) while 4k adds a bunch of 4k ones
(non-dup but what used to be accessed by PMD TLB is now accessed with
PTE TLB).  Some say using one more PMD TLB entry is better while
others disagree.  So, the best course of action here seems to be to
offer both, along with an easy way to select between them, so that
data can be gathered, which is what this patchset does.
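
To make the trade-off above concrete, here is a rough
back-of-the-envelope sketch in plain userspace C.  It is not kernel
code and not part of the patchset.  The function names are made up
for illustration, the 2MB PMD and 4k PTE sizes are the usual x86-64
values, and it assumes each mapping costs one TLB entry per page it
covers.

```c
/* Page sizes assumed for illustration (typical x86-64 values). */
#define PTE_PAGE_SIZE (4UL * 1024)          /* 4k page, PTE-mapped */
#define PMD_PAGE_SIZE (2UL * 1024 * 1024)   /* 2MB page, PMD-mapped */

/* Number of pages of page_size needed to cover bytes (rounds up). */
unsigned long pages_needed(unsigned long bytes, unsigned long page_size)
{
	return (bytes + page_size - 1) / page_size;
}

/*
 * remap: the unit is reached through a second large-page mapping that
 * duplicates what the linear map already covers, so the extra cost is
 * counted in PMD TLB entries.
 */
unsigned long remap_dup_pmd_entries(unsigned long unit_bytes)
{
	return pages_needed(unit_bytes, PMD_PAGE_SIZE);
}

/*
 * 4k: the unit is mapped with 4k pages, so accesses that used to hit
 * a PMD TLB entry now need one PTE TLB entry per touched page.
 */
unsigned long fourk_pte_entries(unsigned long unit_bytes)
{
	return pages_needed(unit_bytes, PTE_PAGE_SIZE);
}
```

With a hypothetical 64k per-cpu unit, remap_dup_pmd_entries() gives 1
and fourk_pte_entries() gives 16.  Which of the two hurts more in
practice is exactly the thing that needs measuring.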

I don't think the added complexity for cpa() justifies dropping remap
without further testing.  The added complexity isn't that big.  Most
of the confusion in this patchset came from my ignorance of the
subject.  cpa() is a fragile thing but we need it anyway, so...

Thanks.

-- 
tejun