Message-ID: <alpine.LFD.2.02.1202191412060.3898@i5.linux-foundation.org>
Date:	Sun, 19 Feb 2012 14:23:05 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>
cc:	x86@...nel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [PATCH 0/2] More i387 state save/restore work


Ok, this is a series of two patches that continue my i387 state 
save/restore series, but aren't necessarily worth it for Linux-3.3.

That said, the first one is a bug-fix - but it's an old bug, and I'm not 
sure it can actually be triggered. The failure path for the FP state 
preload is bogus - and always was. But I'm not sure it really *can* fail.

The first one has another small bugfix in it too, and I think that one may 
be new to the rewritten FP state preloading - it doesn't update the 
fpu_counter, so once it starts preloading, it never stops.

I wrote a silly FPU task switch testing program, which basically starts 
two processes pinned to the same CPU, and then uses sched_yield() in both 
to switch back-and-forth between them. *One* of the processes uses the FPU 
between every yield, the other does not. It runs for two seconds, and 
counts how many loops it gets through.

With that test, I get:

 - Plain 3.3-rc4:

   [torvalds@i5 ~]$ uname -r
   3.3.0-rc4
   [torvalds@i5 ~]$ ./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;
   2216090 loops in 2 seconds
   2216922 loops in 2 seconds
   2217148 loops in 2 seconds
   2232191 loops in 2 seconds
   2186203 loops in 2 seconds
   2231614 loops in 2 seconds

 - With the first patch that fixes the FPU preloading to eventually stop:

   [torvalds@i5 ~]$ uname -r
   3.3.0-rc4-00001-g704ed737bd3c
   [torvalds@i5 ~]$ ./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;
   2306667 loops in 2 seconds
   2295760 loops in 2 seconds
   2295494 loops in 2 seconds
   2296282 loops in 2 seconds
   2282229 loops in 2 seconds
   2301842 loops in 2 seconds

 - With the second patch that does the lazy preloading:

   [torvalds@i5 ~]$ uname -r
   3.3.0-rc4-00002-g022899d937f9
   [torvalds@i5 ~]$ ./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;./a.out ;
   2466973 loops in 2 seconds
   2456168 loops in 2 seconds
   2449863 loops in 2 seconds
   2461588 loops in 2 seconds
   2478256 loops in 2 seconds
   2476844 loops in 2 seconds

so these things do make some difference. But it is also interesting to see 
from profiles just how expensive setting CR0.TS is (the write to CR0 is 
very expensive indeed), so even when you avoid the FP state restore 
lazily, just setting TS in between task switches is still a big part of 
the cost of FPU save/restore.

Linus Torvalds (2):
  i387: use 'restore_fpu_checking()' directly in task switching code
  i387: support lazy restore of FPU state

 arch/x86/include/asm/i387.h      |   48 +++++++++++++++++++++++++++----------
 arch/x86/include/asm/processor.h |    3 +-
 arch/x86/kernel/cpu/common.c     |    2 +
 arch/x86/kernel/process_32.c     |    2 +-
 arch/x86/kernel/process_64.c     |    2 +-
 arch/x86/kernel/traps.c          |   40 ++++++-------------------------
 6 files changed, 49 insertions(+), 48 deletions(-)

Comments? I feel confident enough about these that I think they might even 
work in 3.3, especially the first one. But I want people to look at 
them.

                     Linus

-- 
1.7.9.188.g12766.dirty

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
