Message-ID: <49962413.9020101@zytor.com>
Date:	Fri, 13 Feb 2009 17:53:23 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Tejun Heo <tj@...nel.org>
CC:	Rusty Russell <rusty@...tcorp.com.au>, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Jeremy Fitzhardinge <jeremy@...p.org>, cpw@....com
Subject: Re: #tj-percpu has been rebased

Tejun Heo wrote:
> 
> Percpu areas are allocated in chunks in the vmalloc area.  Each chunk
> consists of num_possible_cpus() units and the first chunk is used for
> static percpu variables in the kernel image (special boot time
> alloc/init handling necessary as these areas need to be brought up
> before allocation services are running).  Units grow as necessary and
> all units grow or shrink in unison.  When a chunk is filled up,
> another chunk is allocated.  I.e., in the vmalloc area
> 
>   c0                           c1                         c2           
>    -------------------          -------------------        ------------
>   | u0 | u1 | u2 | u3 |        | u0 | u1 | u2 | u3 |      | u0 | u1 | u
>    -------------------  ......  -------------------  ....  ------------
> 
> Allocation is done in offset-size areas of single unit space.  I.e.,
> when UNIT_SIZE is 128k, a 512-byte area at 134k occupies 512 bytes at
> 6k of c1:u0, c1:u1, c1:u2 and c1:u3.  Percpu access can be done by
> configuring percpu base registers UNIT_SIZE apart.
> 
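
To make the arithmetic in the quoted example concrete, here is a minimal C sketch of how an offset in single-unit space could map to a chunk and to a position inside each unit. The helper names and NR_UNITS are assumptions made for illustration; they are not taken from the patch set.

```c
#include <assert.h>
#include <stddef.h>

#define UNIT_SIZE (128u * 1024u) /* unit size from the quoted example */
#define NR_UNITS  4u             /* assumption: 4 possible CPUs, as in the diagram */

/* Hypothetical helper: index of the chunk containing an area offset. */
static unsigned int area_chunk(size_t area_off)
{
	return (unsigned int)(area_off / UNIT_SIZE);
}

/* Hypothetical helper: offset of the area within each unit of its chunk. */
static size_t area_unit_off(size_t area_off)
{
	return area_off % UNIT_SIZE;
}

/* Byte offset of one CPU's copy, measured from the start of its chunk. */
static size_t cpu_copy_off(size_t area_off, unsigned int cpu)
{
	assert(cpu < NR_UNITS);
	return (size_t)cpu * UNIT_SIZE + area_unit_off(area_off);
}
```

With these definitions, the area at 134k lands in chunk 1 at offset 6k of every unit, and consecutive CPUs' copies sit exactly UNIT_SIZE apart, which is what lets the percpu base registers be configured UNIT_SIZE apart.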

Okay, let's think about this a bit.

At least for x86, there are two cases:

- 32 bits.  The vmalloc area is *extremely* constrained, and has the 
same class of fragmentation issues as main memory.  In fact, it might 
have *more* just by virtue of being larger.

- 64 bits.  At this point, we have with current memory sizes(*) an 
astronomically large virtual space.  Here we have no real problem 
allocating linearly in virtual space, either by giving each CPU some 
very large hunk of virtual address space (which means each percpu area 
is contiguous in virtual space) or by doing large contiguous allocations 
out of another range.

At first glance, I don't see any advantage to interlacing the CPUs.  
Quite the contrary, it seems to utterly preclude ever getting a win 
from PMD mappings, since (a) you'd be allocating real memory for CPUs 
which aren't actually there and (b) you'd have the wrong NUMA 
associativity.
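
One way to see the PMD concern is to count how many CPUs' units a single large-page mapping would cover. The sketch below is an illustration only: it assumes a 2 MiB PMD mapping (the x86-64 large page size), assumes the mapping starts on a unit boundary, and uses a hypothetical helper name.

```c
#include <stddef.h>

#define PMD_SIZE ((size_t)2 << 20) /* assumption: 2 MiB large page, as on x86-64 */

/*
 * Hypothetical helper: number of distinct CPUs whose percpu space falls
 * under one PMD-sized mapping that starts on a unit boundary.
 * `stride` is the distance between consecutive CPUs' areas.
 */
static unsigned int cpus_under_one_pmd(size_t stride, unsigned int nr_cpus)
{
	size_t units = (PMD_SIZE + stride - 1) / stride; /* units touched, rounded up */

	return units < nr_cpus ? (unsigned int)units : nr_cpus;
}
```

With the interleaved layout and a 128k stride, one such mapping spans 16 CPUs' units, so it backs every one of them with memory from a single physical page on a single node. With a hypothetical per-CPU stride of 1 GiB, the same mapping covers exactly one CPU.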

	-hpa


(*) In about 20 years we'd better get the remaining virtual address bits...