Date:	Tue, 13 Jan 2009 01:01:12 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Mike Travis <travis@....com>
Cc:	Frederik Deweerdt <frederik.deweerdt@...og.eu>,
	andi@...stfloor.org, tglx@...utronix.de, hpa@...or.com,
	linux-kernel@...r.kernel.org
Subject: Re: [patch] tlb flush_data: replace per_cpu with an array


* Mike Travis <travis@....com> wrote:

> Frederik Deweerdt wrote:
> > On Mon, Jan 12, 2009 at 02:54:25PM -0800, Mike Travis wrote:
> >> Frederik Deweerdt wrote:
> >>> Hi,
> >>>
> >>> On x86_64, the TLB flush data is stored in per_cpu variables. This is
> >>> unnecessary because only the first NUM_INVALIDATE_TLB_VECTORS entries
> >>> are accessed.
> >>> This patch aims at making the code less confusing (there's nothing
> >>> really "per_cpu" about it) by using a plain array. It would also save
> >>> some memory on most distros out there (Ubuntu x86_64 has NR_CPUS=64 by
> >>> default).
> >>>
> >>> Regards,
> >>> Frederik
> >>>
> >>> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@...og.eu>
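
For reference, a minimal sketch of what the patch does, assuming the
2.6.28-era x86_64 TLB-flush code (arch/x86/kernel/tlb_64.c); the actual
patch may differ in detail:

	/* Before: one slot per possible CPU, although only the first
	 * NUM_INVALIDATE_TLB_VECTORS slots are ever used. */
	static DEFINE_PER_CPU(union smp_flush_state, flush_state);

	/* After: a plain array with exactly one slot per invalidate IPI
	 * vector, independent of NR_CPUS. */
	static union smp_flush_state flush_state[NUM_INVALIDATE_TLB_VECTORS];

	/* Accesses change from per_cpu(flush_state, sender)
	 * to flush_state[sender]. */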
> >> Here is the net change in memory usage with this patch on an
> >> allyesconfig with NR_CPUS=4096.
> 
> > Yes, this point wrt. memory was based on my flawed understanding of how
> > per_cpu actually allocates the data. There are, however, still two
> > benefits: 1) a confusing use of per_cpu is removed, 2) access to the
> > flush data is faster.
> 
> Is this true?  On a widely separated NUMA system, requiring all CPUs to 
> access memory on NODE 0 for every TLB flush would seem expensive.  
> That's another benefit of per_cpu data: it's local to the node's CPUs.
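
For reference, a rough sketch of why per_cpu data is node-local, based on
that era's arch/x86/kernel/setup_percpu.c; names and details may differ:

	/* Each CPU's per-cpu area is allocated from bootmem on that
	 * CPU's own node, falling back to any node if it is offline. */
	for_each_possible_cpu(cpu) {
		int node = early_cpu_to_node(cpu);
		void *ptr;

		if (!node_online(node) || !NODE_DATA(node))
			ptr = alloc_bootmem_pages(size);
		else
			ptr = alloc_bootmem_pages_node(NODE_DATA(node), size);
		/* the area is then recorded as this cpu's per-cpu base */
	}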

That's irrelevant here: we only have 8 IPI slots for the TLB flush, so 
we'll use the first 8 per_cpu areas (note how all access is cpu-nr modulo 
8 in essence). That is going to be node 0 anyway in most NUMA setups.
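
A sketch of the access pattern in question, assuming that era's
flush_tlb_others() in arch/x86/kernel/tlb_64.c; details may differ:

	/* The slot is always the CPU number modulo the number of
	 * invalidate IPI vectors, so only slots 0..7 are ever touched,
	 * regardless of NR_CPUS. */
	int sender = smp_processor_id() % NUM_INVALIDATE_TLB_VECTORS;
	union smp_flush_state *f = &per_cpu(flush_state, sender); /* before */
	/* union smp_flush_state *f = &flush_state[sender]; */    /* after  */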

> (And was it determined yet that a cacheline has to be tossed around as 
> well?)

No, that assertion of Andi is wrong.

	Ingo
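
For context on the cacheline question above, a rough sketch of the per-slot
state as it looked in that era's arch/x86/kernel/tlb_64.c (field order and
names may differ): each slot is padded to a full cache line, so two slots
never share a line, whether they live in per_cpu areas or in a plain array.

	union smp_flush_state {
		struct {
			cpumask_t flush_cpumask;    /* CPUs still to acknowledge */
			struct mm_struct *flush_mm; /* mm being flushed */
			unsigned long flush_va;     /* address, or TLB_FLUSH_ALL */
			spinlock_t tlbstate_lock;   /* serializes users of a slot */
		};
		char pad[SMP_CACHE_BYTES];          /* pad to a full cache line */
	};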
