Message-ID: <20110725111224.GP28787@elte.hu>
Date:	Mon, 25 Jul 2011 13:12:24 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Andy Lutomirski <luto@....EDU>
Cc:	x86 <x86@...nel.org>, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Arjan van de Ven <arjan@...radead.org>,
	Avi Kivity <avi@...hat.com>
Subject: Re: [PATCH 3.1?] x86: Remove useless stts/clts pair in __switch_to


* Andy Lutomirski <luto@....EDU> wrote:

> An stts/clts pair takes over 70 ns by itself on Sandy Bridge, and
> when other things are going on it's apparently even worse.  This
> saves 10% on context switches between threads that both use extended
> state.
> 
> Signed-off-by: Andy Lutomirski <luto@....edu>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Arjan van de Ven <arjan@...radead.org>
> Cc: Avi Kivity <avi@...hat.com>
> ---
> 
> This is not as well tested as it should be (especially on 32-bit, where
> I haven't actually tried compiling it), but I think this might be 3.1
> material so I want to get it out for review before it's even more
> unjustifiably late :)
> 
> Argument for inclusion in 3.1 (after a bit more testing):
>  - It's dead simple.
>  - It's a 10% speedup on context switching under the right conditions [1]
>  - It's unlikely to slow any workload down, since it doesn't add any work
>    anywhere.
> 
> Argument against:
>  - It's late.

I think it's late.

It would be much better to put it into the x86/xsave tree I pointed 
to, and to treat and debug it as a coherent unit. FPU bugs need a lot 
of time to surface, so we definitely do not want to fast-track it. In 
fact, if we want it in v3.2, we should start assembling the tree right 
now.

Also, if you are tempted by the prospect of possibly enabling vector 
instructions for the x86 kernel, we could try that too, and get 
multiple speedups for the price of having to debug the tree only once 
;-)

Thanks,

	Ingo