linux-kernel - Re: #tj-percpu has been rebased

Message-Id: <200902161753.14141.rusty@rustcorp.com.au>
Date:	Mon, 16 Feb 2009 17:53:13 +1030
From:	Rusty Russell <rusty@...tcorp.com.au>
To:	Tejun Heo <tj@...nel.org>
Cc:	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Jeremy Fitzhardinge <jeremy@...p.org>, cpw@....com
Subject: Re: #tj-percpu has been rebased

On Saturday 14 February 2009 11:15:14 Tejun Heo wrote:
> Rusty Russell wrote:
> > On Thursday 12 February 2009 14:14:08 Tejun Heo wrote:
> >> Oops, those are the same ones.  I'll give a shot at cooking up
> >> something which can be dynamically sized before going forward with
> >> this one.
> > 
> > That's why I handed it to you! :)
> > 
> > Just remember we waited over 5 years for this to happen: the point of these
> > is that Christoph showed it's still useful.
> > 
> > (And I really like the idea of allocing congruent areas rather than remapping
> >  if someone can show that it's semi-reliable.  Good luck!)
> 
> I finished writing up the first draft last night.  Somehow I can feel
> long grueling debugging hours ahead of me but it generally goes like
> the following.
> 
> Percpu areas are allocated in chunks in the vmalloc area.  Each chunk
> consists of num_possible_cpus() units and the first chunk is used for
> static percpu variables in the kernel image (special boot time
> alloc/init handling is necessary as these areas need to be brought up
> before allocation services are running).  Units grow as necessary and
> all units grow or shrink in unison.  When a chunk is filled up,
> another chunk is allocated.  I.e., in the vmalloc area:
> 
>   c0                           c1                         c2           
>    -------------------          -------------------        ------------
>   | u0 | u1 | u2 | u3 |        | u0 | u1 | u2 | u3 |      | u0 | u1 | u
>    -------------------  ......  -------------------  ....  ------------
> 
> Allocation is done in offset-size areas of single unit space.  I.e.,
> when UNIT_SIZE is 128k, a 512-byte area at offset 134k occupies 512
> bytes at offset 6k in each of c1:u0, c1:u1, c1:u2 and c1:u3.  Percpu
> access can be done by configuring percpu base registers UNIT_SIZE
> apart.
> 
> Currently it uses pte mappings but by using a larger UNIT_SIZE, it can
> be modified to use pmd mappings.  I'm a bit skeptical about this tho.
> Percpu pages are allocated with HIGHMEM | COLD, so they won't
> interfere with the physical mapping and on !NUMA it lifts load from
> pgd tlb by not having stuff for different cpus occupying the same pgd
> page.

Not sure I understand all of this, but it sounds like a straight virtual
mapping with some chosen separation between the mappings.
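
As a rough illustration of that reading, the address arithmetic in the
quoted example (UNIT_SIZE of 128k, an area at offset 134k) might be
sketched as below.  This is only a sketch: struct pcpu_loc, pcpu_locate()
and pcpu_offset_in_chunk() are hypothetical names made up for the
example and are not taken from the actual patches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not from the percpu patches. */
struct pcpu_loc {
	size_t chunk;	/* which chunk (c0, c1, ...) holds the area */
	size_t offset;	/* byte offset inside every unit of that chunk */
};

/* Split a linear allocation offset into (chunk, offset-in-unit). */
static struct pcpu_loc pcpu_locate(size_t area_off, size_t unit_size)
{
	struct pcpu_loc loc;

	assert(unit_size > 0);
	loc.chunk = area_off / unit_size;
	loc.offset = area_off % unit_size;
	return loc;
}

/*
 * Byte offset of one cpu's copy from the start of its chunk.  Units
 * sit unit_size apart, so cpu N's copy is N * unit_size further on.
 */
static size_t pcpu_offset_in_chunk(struct pcpu_loc loc, size_t cpu,
				   size_t unit_size)
{
	assert(loc.offset < unit_size);
	return cpu * unit_size + loc.offset;
}
```

With unit_size = 128k and area_off = 134k, pcpu_locate() gives chunk 1
and offset 6k, matching the c1:u0 ... c1:u3 placement described above.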

But note that for the non-NUMA case, you can just use kmalloc/__get_free_pages
and no remapping tricks are necessary at all.
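
One possible shape for that, as a userspace sketch only: a single
contiguous allocation carved into equal per-cpu units, with calloc()
standing in for kmalloc/__get_free_pages.  The struct and function
names here are hypothetical, and a real kernel version would differ in
the allocator call, alignment and error handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flat layout: nr_cpus units packed back to back. */
struct flat_percpu {
	unsigned char *base;	/* start of the single allocation */
	size_t unit_size;	/* bytes reserved per cpu */
	size_t nr_cpus;
};

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
static int flat_percpu_init(struct flat_percpu *p, size_t nr_cpus,
			    size_t unit_size)
{
	if (nr_cpus == 0 || unit_size == 0)
		return -1;
	/* calloc() stands in for kmalloc/__get_free_pages here. */
	p->base = calloc(nr_cpus, unit_size);
	if (!p->base)
		return -1;
	p->unit_size = unit_size;
	p->nr_cpus = nr_cpus;
	return 0;
}

/* Address of a given cpu's copy of the area at offset off. */
static void *flat_percpu_ptr(const struct flat_percpu *p, size_t cpu,
			     size_t off)
{
	assert(cpu < p->nr_cpus && off < p->unit_size);
	return p->base + cpu * p->unit_size + off;
}

static void flat_percpu_free(struct flat_percpu *p)
{
	free(p->base);
	p->base = NULL;
}
```

The copies for consecutive cpus end up exactly unit_size apart, with no
page-table work involved.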

Thanks,
Rusty.