Date:	Wed, 25 Feb 2009 02:30:17 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Tejun Heo <tj@...nel.org>, rusty@...tcorp.com.au,
	tglx@...utronix.de, x86@...nel.org, linux-kernel@...r.kernel.org,
	hpa@...or.com, jeremy@...p.org, cpw@....com,
	ink@...assic.park.msu.ru
Subject: Re: [PATCHSET x86/core/percpu] improve the first percpu chunk allocation

On Wednesday 25 February 2009 02:19:20 Ingo Molnar wrote:
> * Tejun Heo <tj@...nel.org> wrote:
> > Hi,
> >
> > Ingo Molnar wrote:
> > --snip--
> >
> > > So what i'm saying is that these are strong reasons for us to
> > > want to make the unit size to be something like 2MB - on 64-bit
> > > x86 at least.
> > >
> > > ( Using a 2MB unit size will also have another advantage: _iff_
> > >   we can still allocate a hugepage at that point we can map it
> > >   straight there when extending the dynamic area. )
> >
> > Thanks for the explanation.  Yeap, it would be nice to have
> > units aligned on 2MB boundary.  We'll need to add @align to vm
> > area alloc function to do it correctly.  As for using large
> > page, it would be nice if we can do that automatically.
> > Upfront 2MB unit allocation is probably too expensive but
> > merging 4k pages into a large page (if we can get them) will
> > add a lot of irregular latency too.  Hmmm...
>
> Yeah, largepage support - if we ever get there (the chances of
> finding a proper 2MB aligned 2MB sized chunk of physical memory
> are not very good except the first few minutes of uptime),
> should indeed be automatic to all get_vm_area() users -
> vmalloc(), ioremap() and now percpu.c.

The problem is that get_vm_area() doesn't always know what its
callers want. It would be trivial to let callers specify alignment
(the virtual address allocator already supports it). Kernel page
table setup could then presumably use larger mappings when it is
given contiguous memory.
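
To make the alignment point concrete, here is a minimal userspace
sketch of the arithmetic involved. It is not kernel code: the names
LARGE_PAGE_SIZE, align_up() and can_use_large_mapping() are
hypothetical and used only for illustration, and the check assumes
a 2MB large page as on 64-bit x86.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2MB large-page size, as on 64-bit x86. */
#define LARGE_PAGE_SIZE (UINT64_C(2) << 20)

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical eligibility check: one large mapping is possible only if
 * both the virtual and the physical start are 2MB aligned and the region
 * spans at least one full large page.
 */
static bool can_use_large_mapping(uint64_t vaddr, uint64_t paddr, uint64_t len)
{
	return (vaddr % LARGE_PAGE_SIZE) == 0 &&
	       (paddr % LARGE_PAGE_SIZE) == 0 &&
	       len >= LARGE_PAGE_SIZE;
}
```

An allocator that accepts an alignment argument would apply something
like align_up() to the candidate virtual address; whether the large
mapping can then actually be used still depends on the physical memory
being contiguous and aligned.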


> I think a far more realistic angle to utilize more of the 2MB
> TLB will be to gradually increase PERCPU_ENOUGH_ROOM, as we
> observe more and more percpu_alloc() sites in the kernel. Right
> now it's pretty rare so going beyond the 8K we do for modules
> would probably be a waste of RAM.

It might be useful on smaller NUMA machines that use hashdist for
the big early hashes. (Hash sizes scale logarithmically, but the
number of nodes tends to scale linearly with memory size, so 2MB
per node for hashes would probably be far too much on big
machines.)
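
A rough sketch of that arithmetic, under the assumption that a
distributed hash is spread evenly over all nodes. The helper names
and UNIT_SIZE are hypothetical, and no real hash sizes are implied.

```c
#include <stdint.h>

/* Assumption: a 2MB unit reserved per node. */
#define UNIT_SIZE (UINT64_C(2) << 20)

/* Bytes of a hash table that land on each node if it is spread evenly. */
static uint64_t per_node_share(uint64_t hash_bytes, unsigned int nodes)
{
	return nodes ? hash_bytes / nodes : 0;
}

/* Percentage of a 2MB per-node unit that the share fills, capped at 100. */
static unsigned int unit_fill_percent(uint64_t hash_bytes, unsigned int nodes)
{
	uint64_t share = per_node_share(hash_bytes, nodes);

	if (share >= UNIT_SIZE)
		return 100;
	return (unsigned int)(share * 100 / UNIT_SIZE);
}
```

If the hash grows more slowly than the node count, the per-node share
shrinks as machines get bigger, and a fixed 2MB per node is mostly
wasted.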

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
