Message-ID: <455D0155.9000305@goop.org>
Date:	Thu, 16 Nov 2006 16:24:53 -0800
From:	Jeremy Fitzhardinge <jeremy@...p.org>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Arjan van de Ven <arjan@...radead.org>, Andi Kleen <ak@...e.de>,
	Eric Dumazet <dada1@...mosbay.com>, akpm@...l.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] i386-pda UP optimization

Ingo Molnar wrote:
> what point would there be in using it? It's not like the kernel could 
> make use of the thread keyword anytime soon (it would need /all/ 
> architectures to support it) ...

The plan was to implement the x86 arch-specific percpu stuff to use it,
since it allows gcc better optimisation opportunities.
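
To illustrate the optimisation point, here is a minimal user-space sketch, not
the actual patch. It assumes GCC's __thread extension; struct pda_demo,
demo_pda and both functions are hypothetical names made up for this example.
In the first function the accesses are plain C, so the compiler is free to
combine the two updates. In the second, an empty asm statement with a memory
clobber stands in for hand-written asm accessors, which the compiler cannot
see through, so the value has to be stored and reloaded.

```c
#include <assert.h>

/* Hypothetical stand-in for per-CPU data; with __thread, each thread
 * gets its own copy, reached through a segment register on x86. */
struct pda_demo {
    int cpu_number;
    unsigned long irq_count;
};

static __thread struct pda_demo demo_pda;

/* Plain C access: the compiler can see both increments and may
 * merge them into a single add of 2. */
static unsigned long count_two_irqs(void)
{
    demo_pda.irq_count++;
    demo_pda.irq_count++;
    return demo_pda.irq_count;
}

/* Opaque access: the empty asm with a "memory" clobber is a compiler
 * barrier, so the first increment must be written back before the
 * second one is performed. */
static unsigned long count_two_irqs_opaque(void)
{
    demo_pda.irq_count++;
    __asm__ __volatile__("" ::: "memory");
    demo_pda.irq_count++;
    return demo_pda.irq_count;
}

/* Small self-check: both variants compute the same result. */
static void demo_selfcheck(void)
{
    demo_pda.irq_count = 0;
    assert(count_two_irqs() == 2);
    assert(count_two_irqs_opaque() == 4);
}
```

Comparing the output of gcc -O2 -S for the two functions should show the
difference in the generated code; the results themselves are identical.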

> and the kernel doesn't mind how the 
> current per_cpu() primitives are implemented, via assembly or via C. In 
> any case, it very much matters to see the precise cost of having the pda 
> selector value in %gs versus %fs.

Hm, well, unfortunately for me, there is a small but distinct advantage
to using %fs rather than %gs (around 0-5ns per iteration).  The notable
exception being the "AMD-K6(tm) 3D+ Processor", where %gs is about 25%
(15ns) faster.

I'll revise the patches to use %fs and resubmit.

    J

View attachment "results-mixed.txt" of type "text/plain" (3721 bytes)
