Date:	Sat, 04 Nov 2006 11:09:42 -0800
From:	Zachary Amsden <zach@...are.com>
To:	Chuck Ebbert <76306.1226@...puserve.com>
Cc:	Linus Torvalds <torvalds@...l.org>, Andi Kleen <ak@...e.de>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Benjamin LaHaise <bcrl@...ck.org>
Subject: Re: [rfc patch] i386: don't save eflags on task switch

Chuck Ebbert wrote:
> In-Reply-To: <Pine.LNX.4.64.0611031645141.25218@...osdl.org>
>
> On Fri, 3 Nov 2006 16:46:25 -0800, Linus Torvalds wrote:
>
>   
>> On Fri, 3 Nov 2006, Chuck Ebbert wrote:
>>     
>>> There is no real need to save eflags in switch_to().  Instead,
>>> we can keep a constant value in the thread_struct and always
>>> restore that.
>>>       
>> I don't really see the point. The "pushfl" isn't the expensive part, and 
>> it gives sane and expected semantics.
>>
>> The "popfl" is the expensive part, and that's the thing that can't really 
>> even be removed.
>>     
>
> Well that wasn't the impression I got:
>
>   Date: Mon, 18 Sep 2006 12:12:51 -0400
>   From: Benjamin LaHaise <bcrl@...ck.org>
>   Subject: Re: Sysenter crash with Nested Task Bit set
>
>   ...
>
>   It's the pushfl that will be slow on any OoO CPU, as it has dependencies on 
>   any previous instructions that modified the flags, which ends up bringing 
>   all of the memory ordering dependencies into play.  Doing a popfl to set the 
>   flags to some known value is much less expensive.
>   

That doesn't sound correct to me.  The popf is far more expensive.  
There is no "popfl $IMM" instruction, so setting the flags can never 
avoid the memory read, and the processor must make more expensive 
assumptions about the effect on the following instruction stream (TF, 
DF, all the condition flags used by conditional jumps...).

On every processor I've ever measured, popf is slower.  On P4, for 
example, pushf takes 6 cycles and popf takes 54.  On Opteron, the 
figures are 2 / 12.  On Xeon, they are 7 / 91.
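
For anyone who wants to repeat this kind of comparison, here is a rough user-space sketch. It is not the harness used for the numbers above. It assumes x86-64 with GCC or Clang, so it uses pushfq/popfq rather than the 32-bit pushfl/popfl discussed in this thread. The function names are made up for illustration. __rdtsc() counts reference ticks rather than core cycles, and the loop overhead is included, so only the relative difference between the two loops is meaningful.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc(), provided by GCC and Clang on x86 */

/* Read the flags register with pushfq.  The stack pointer is moved
 * down by 128 bytes first so the push cannot overwrite the x86-64
 * red zone that the compiler may be using. */
static uint64_t read_flags(void)
{
    uint64_t flags;
    __asm__ volatile(
        "lea -128(%%rsp), %%rsp\n\t"
        "pushfq\n\t"
        "pop %0\n\t"
        "lea 128(%%rsp), %%rsp"
        : "=r"(flags)
        :
        : "memory");
    return flags;
}

/* Hypothetical helper: TSC ticks spent on 'iters' pushfq instructions.
 * The pushed word is discarded with lea, which leaves the flags alone. */
static uint64_t tsc_ticks_pushf(unsigned iters)
{
    uint64_t start = __rdtsc();
    for (unsigned i = 0; i < iters; i++) {
        __asm__ volatile(
            "lea -128(%%rsp), %%rsp\n\t"
            "pushfq\n\t"
            "lea 136(%%rsp), %%rsp"
            :
            :
            : "memory");
    }
    return __rdtsc() - start;
}

/* Hypothetical helper: TSC ticks spent on 'iters' push + popfq pairs.
 * There is no immediate form of popf, so the value has to go through
 * memory; the push is therefore part of what is measured.  The value
 * restored is the current flags word, so nothing actually changes. */
static uint64_t tsc_ticks_push_popf(unsigned iters)
{
    uint64_t flags = read_flags();
    uint64_t start = __rdtsc();
    for (unsigned i = 0; i < iters; i++) {
        __asm__ volatile(
            "lea -128(%%rsp), %%rsp\n\t"
            "push %0\n\t"
            "popfq\n\t"
            "lea 128(%%rsp), %%rsp"
            :
            : "r"(flags)
            : "memory", "cc");
    }
    return __rdtsc() - start;
}
```

Calling both helpers with the same large iteration count (say, a million) and dividing each result by that count gives a per-iteration figure for each loop.  Results will vary by CPU, and under virtualization they may not resemble the figures above at all.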

Zach