linux-kernel - Re: [PATCH] percpu: add optimized generic percpu accessors

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <m1fxjcucp8.fsf@fess.ebiederm.org>
Date:	Wed, 21 Jan 2009 03:21:23 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Tejun Heo <tj@...nel.org>
Cc:	Ingo Molnar <mingo@...e.hu>, Rusty Russell <rusty@...tcorp.com.au>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	akpm@...ux-foundation.org, hpa@...or.com, brgerst@...il.com,
	cl@...ux-foundation.org, travis@....com,
	linux-kernel@...r.kernel.org, steiner@....com, hugh@...itas.com,
	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Subject: Re: [PATCH] percpu: add optimized generic percpu accessors

Tejun Heo <tj@...nel.org> writes:

> Ingo Molnar wrote:
>> The larger point still remains: the kernel dominantly uses static percpu 
>> variables by a margin of 10 to 1, so we cannot just brush away the static 
>> percpu variables and must concentrate on optimizing that side with 
>> priority. It's nice if the dynamic percpu-alloc side improves as well, of 
>> course.
>
> Well, the infrequent usage of dynamic percpu allocation is in some
> part due to the poor implementation, so it's sort of chicken and egg
> problem.  I got into this percpu thing because I wanted a percpu
> reference count which can be dynamically allocated and it sucked.

Counters are our other special case, and counters are interesting
because they are individually very small.  I just looked and the vast
majority of the alloc_percpu users are counters.

I just did a rough count in include/linux/snmp.h  and I came
up with 171*2 counters.  At 8 bytes per counter that is roughly
2.7K.  Or about two thirds of a 4K page.

What makes this is a challenge is that those counters are per network
namespace, and there are no static limits on the number of network
namespaces.

If we push the system and allocate 1024 network namespaces we wind up
needing 2.7MB per cpu, just for the SNMP counters.  

Which nicely illustrates the principle that typically each individual
per cpu allocation is small, but with dynamic allocation we have the
challenge that number of allocations becomes unbounded and in some cases
could be large, while the typical per cpu size is likely to be very small.

I wonder if for the specific case of counters it might make sense to
simply optimize the per cpu allocator for machine word size allocations
and allocate each counter individually freeing us from the burden of
worrying about fragmentation.

The pain with the current alloc_percpu implementation when working
with counters is that it has to allocate an array with one entry
for each cpu to point at the per cpu data.  Which isn't especially efficient.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/