Date:	Fri, 15 May 2009 08:47:37 +0100
From:	"Jan Beulich" <JBeulich@...ell.com>
To:	"Tejun Heo" <tj@...nel.org>
Cc:	<mingo@...e.hu>, <andi@...stfloor.org>, <tglx@...utronix.de>,
	<linux-kernel@...r.kernel.org>, <hpa@...or.com>
Subject: Re: [GIT PATCH] x86,percpu: fix pageattr handling with remap allocator

>>> Tejun Heo <tj@...nel.org> 14.05.09 17:55 >>>
>> In order to reduce the amount of work to do during lookup as well as
>> the chance of having a collision at all, wouldn't it be reasonable
>> to use as much of an allocated 2/4M page as possible rather than
>> returning whatever is left after a single CPU got its per-CPU memory
>> chunk from it? I.e. you'd return only those (few) pages that either
>> don't fit another CPU's chunk anymore or that are left after running
>> through all CPUs.
>> 
>> Or is there some hidden requirement that each CPU's per-CPU area must
>> start on a PMD boundary?
>
>The whole point of doing the remapping is giving each CPU its own PMD
>mapping for the percpu area, so, yeah, that's the requirement.  I don't
>think the requirement is hidden though.

No, from looking at the code the only requirement seems to be that the
memory gets allocated on the correct node and mapped by a large page.
Nothing there says why the final virtual address would also need to be
large-page aligned. I.e., with a slight modification to take the NUMA
requirement into account (I noticed only after sending that mail that I
had ignored this aspect), the previous suggestion would still appear
usable to me.
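
The packing suggested above can be sketched numerically — a minimal
illustration (the unit sizes are hypothetical, and this is arithmetic,
not kernel code) of how many CPUs' chunks fit into one large page and
how little would be left to hand back:

```python
# Sketch of the packing idea: instead of dedicating one 2M/4M large
# page per CPU, carve as many per-CPU units as fit out of each large
# page and return only the genuinely unusable remainder.
# Sizes below are illustrative assumptions.

LARGE_PAGE = 2 * 1024 * 1024   # PMD size with PAE (4M without)
PAGE = 4096

def pack_units(nr_cpus, unit_size, large_page=LARGE_PAGE):
    """Return (large_pages_needed, leftover_4k_pages_returned)."""
    units_per_lp = large_page // unit_size      # CPUs served per large page
    lps = -(-nr_cpus // units_per_lp)           # ceiling division
    used = nr_cpus * unit_size
    leftover = (lps * large_page - used) // PAGE
    return lps, leftover

# 32 CPUs with 128K units: packing needs 2 large pages instead of 32,
# with nothing left over to return to the allocator.
print(pack_units(32, 128 * 1024))   # → (2, 0)
```

With one-PMD-per-CPU the same configuration would pin 32 large pages;
the packed layout only returns loose 4K pages when the CPU count does
not divide evenly into a large page.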

>How hot is the cpa path?  On my test systems, there were only a few
>calls during init and then nothing.  Does it become very hot if, for
>example, GEM is used?  But I really don't think the log2 binary search
>overhead would be anything noticeable compared to TLB shootdown and
>all the other stuff going on there.

I would view cutting down on that only as a nice side effect, not a primary
reason to do the change. The primary reason is this:

>> This would additionally address a potential problem on 32-bits -
>> currently, for a 32-CPU system you consume half of the vmalloc space
>> with PAE (on non-PAE you'd even exhaust it, but I think it's
>> unreasonable to expect a system having 32 CPUs to not need PAE).
>
>I recall having about the same conversation before.  Looking up...
>
>-- QUOTE --
>  Actually, I've been looking at the numbers and I'm not sure if the
>  concern is valid.  On x86_32, the practical number of maximum
>  processors would be around 16 so it will end up 32M, which isn't
>  nice and it would probably be a good idea to introduce a parameter to
>  select which allocator to use but still it's far from consuming all
>  the VM area.  On x86_64, the vmalloc area is obscenely large at 2^45,
>  i.e. 32 terabytes.  Even with 4096 processors, a single chunk is a
>  measly 0.02%.

Just to note - there must be a reason we (SuSE/Novell) build our default
32-bit kernel with support for 128 CPUs, which is now simply broken.
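
The arithmetic behind these figures is easy to check — a rough sketch
assuming the usual ~128M default vmalloc area on 32-bit and 2M (PAE) /
4M (non-PAE) PMD sizes; these values are illustrative assumptions, not
measurements:

```python
# Fraction of the 32-bit vmalloc area consumed when the remap
# allocator dedicates one PMD-sized mapping per CPU.
# VMALLOC_32BIT is an assumed default, not read from a real kernel.

MB = 1024 * 1024
VMALLOC_32BIT = 128 * MB

def remap_consumption(nr_cpus, pmd_size, vmalloc_size=VMALLOC_32BIT):
    """Fraction of the vmalloc area used by one PMD per CPU."""
    return nr_cpus * pmd_size / vmalloc_size

print(remap_consumption(32, 2 * MB))    # PAE, 32 CPUs  → 0.5 (half)
print(remap_consumption(32, 4 * MB))    # non-PAE, 32   → 1.0 (exhausted)
print(remap_consumption(128, 2 * MB))   # PAE, 128 CPUs → 2.0 (impossible)
```

The last line is the 128-CPU default config: twice the entire vmalloc
area would be needed before a single vmalloc() call is served.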

>  If it's a problem for other archs or extreme x86_32 configurations,
>  we can add some safety measures but in general I don't think it is a
>  problem.
>-- END OF QUOTE --
>
>So, yeah, if there are 32bit 32-way NUMA machines out there, it would
>be wise to skip remap allocator on such machines.  Maybe we can
>implement a heuristic - something like "if vm area consumption goes
>over 25%, don't use remap".

Possibly, as a secondary consideration on top of the suggested reduction
of virtual address space consumption.
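
The 25% cutoff floated above could look like this — a minimal sketch of
the heuristic, where the threshold and the sizes are assumptions for
illustration, not anything taken from the kernel:

```python
# Heuristic sketch: skip the remap allocator when one-PMD-per-CPU
# would consume more than a quarter of the vmalloc area.
# Threshold (25%) and sizes are illustrative assumptions.

MB = 1024 * 1024

def use_remap(nr_cpus, pmd_size, vmalloc_size, limit=0.25):
    """True if per-CPU remap mappings stay within the budget."""
    return nr_cpus * pmd_size <= limit * vmalloc_size

print(use_remap(16, 2 * MB, 128 * MB))   # 32M of 128M = 25% → True
print(use_remap(32, 2 * MB, 128 * MB))   # 64M of 128M = 50% → False
```

Under these assumed numbers a 16-CPU PAE box just squeaks in, while the
32-CPU case from earlier in the thread would fall back to the regular
allocator.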

Jan

