Date:	Tue,  5 May 2015 18:19:51 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	linux-kernel@...r.kernel.org
Cc:	Andy Lutomirski <luto@...capital.net>,
	Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Fenghua Yu <fenghua.yu@...el.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: [PATCH 000/208] big x86 FPU code rewrite

Over the past 10 years the x86 FPU code has organically grown into
somewhat of a spaghetti monster that few (if any) kernel
developers understand and that few people enjoy hacking on.

Many people suggested over the years that it needs a major cleanup,
and some time ago I went "what the heck" and started doing it step
by step to see where it leads - it cannot be that hard!

Three weeks and 200+ patches later I think I have to admit that I
seriously underestimated the magnitude of the project! ;-)

This work-in-progress series is large, but I think it makes the
code maintainable and hackable again. It's pretty complete, as
per the 9 high level goals laid out further below. Individual
patches are all fine-grained, so should be easy to review - Boris
Petkov already reviewed most of the patches so they are not
entirely raw.

Individual patches have been tested heavily for bisectability: they
were both build and boot tested on a relatively wide range of x86
hardware that I have access to. Nevertheless, the changes are pretty
invasive, so I'd expect there to be test failures.

This is the only time I intend to post them to lkml in their entirety,
to not spam lkml too much.  (Future additions will be posted as delta
series.)

I'd like to ask interested people to test this tree, and to comment
on the patches. The changes can be found in the following Git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tmp.fpu

(The tree might be rebased, depending on feedback.)

Here are the main themes that motivated most of the changes:

1)

I collected all FPU code into arch/x86/kernel/fpu/*.c and split it
all up into the following, topically organized source code files:

  -rw-rw-r-- 1 mingo mingo  1423 May  5 16:36 arch/x86/kernel/fpu/bugs.c
  -rw-rw-r-- 1 mingo mingo 12206 May  5 16:36 arch/x86/kernel/fpu/core.c
  -rw-rw-r-- 1 mingo mingo  7342 May  5 16:36 arch/x86/kernel/fpu/init.c
  -rw-rw-r-- 1 mingo mingo 10909 May  5 16:36 arch/x86/kernel/fpu/measure.c
  -rw-rw-r-- 1 mingo mingo  9012 May  5 16:36 arch/x86/kernel/fpu/regset.c
  -rw-rw-r-- 1 mingo mingo 11188 May  5 16:36 arch/x86/kernel/fpu/signal.c
  -rw-rw-r-- 1 mingo mingo 10140 May  5 16:36 arch/x86/kernel/fpu/xstate.c

Similarly I've collected and split up all FPU related header files, and
organized them topically:

  -rw-rw-r-- 1 mingo mingo  1690 May  5 16:35 arch/x86/include/asm/fpu/api.h
  -rw-rw-r-- 1 mingo mingo 12937 May  5 16:36 arch/x86/include/asm/fpu/internal.h
  -rw-rw-r-- 1 mingo mingo   278 May  5 16:36 arch/x86/include/asm/fpu/measure.h
  -rw-rw-r-- 1 mingo mingo   596 May  5 16:35 arch/x86/include/asm/fpu/regset.h
  -rw-rw-r-- 1 mingo mingo  1013 May  5 16:35 arch/x86/include/asm/fpu/signal.h
  -rw-rw-r-- 1 mingo mingo  8137 May  5 16:36 arch/x86/include/asm/fpu/types.h
  -rw-rw-r-- 1 mingo mingo  5691 May  5 16:36 arch/x86/include/asm/fpu/xstate.h

<fpu/api.h> is the only 'public' API left, used in various drivers.

I decoupled drivers and non-FPU x86 code from various FPU internals.

2)

I renamed various internal data types, APIs and helpers, and organized
their support functions accordingly.

For example, all functions that deal with copying FPU registers in and
out of the FPU are now named consistently:

      copy_fxregs_to_kernel()         # was: fpu_fxsave()
      copy_xregs_to_kernel()          # was: xsave_state()

      copy_kernel_to_fregs()          # was: frstor_checking()
      copy_kernel_to_fxregs()         # was: fxrstor_checking()
      copy_kernel_to_xregs()          # was: fpu_xrstor_checking()
      copy_kernel_to_xregs_booting()  # was: xrstor_state_booting()

      copy_fregs_to_user()            # was: fsave_user()
      copy_fxregs_to_user()           # was: fxsave_user()
      copy_xregs_to_user()            # was: xsave_user()

      copy_user_to_fregs()            # was: frstor_user()
      copy_user_to_fxregs()           # was: fxrstor_user()
      copy_user_to_xregs()            # was: xrestore_user()
      copy_user_to_fpregs_zeroing()   # was: restore_user_xstate()

'xregs'  stands for registers supported by XSAVE
'fxregs' stands for registers supported by FXSAVE
'fregs'  stands for registers supported by FSAVE
'fpregs' stands for generic FPU registers.

Similarly, the high level FPU functions got reorganized as well:

    extern void fpu__activate_curr(struct fpu *fpu);
    extern void fpu__activate_stopped(struct fpu *fpu);
    extern void fpu__save(struct fpu *fpu);
    extern void fpu__restore(struct fpu *fpu);
    extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
    extern void fpu__drop(struct fpu *fpu);
    extern int  fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
    extern void fpu__clear(struct fpu *fpu);
    extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);

Those functions that used to take a task_struct argument now take
the more limited 'struct fpu' argument, and their naming is consistent
and logical as well.
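
To illustrate the narrower calling convention, here is a minimal
user-space sketch. It is not the kernel's code: every sketch_* type and
function is a hypothetical stand-in, and only the field names
fpstate_active/fpregs_active and the task->thread.fpu nesting are taken
from this mail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-ins for the kernel types. */
struct sketch_fpu {
	unsigned char fpstate_active;	/* the context has saved FPU state */
	unsigned char fpregs_active;	/* the state is live in the registers */
	unsigned char state[64];	/* placeholder for the register image */
};

struct sketch_thread {
	struct sketch_fpu fpu;
};

struct sketch_task {
	int pid;
	struct sketch_thread thread;
};

/*
 * Takes only the FPU context, so the helper cannot reach unrelated task
 * fields. This models the calling convention only, not what the
 * kernel's fpu__clear() actually does.
 */
static void sketch_fpu_clear(struct sketch_fpu *fpu)
{
	assert(fpu != NULL);
	memset(fpu->state, 0, sizeof(fpu->state));
	fpu->fpstate_active = 0;
	fpu->fpregs_active = 0;
}

/* Callers that hold a task pass the embedded member explicitly. */
static void sketch_task_reset_fpu(struct sketch_task *tsk)
{
	sketch_fpu_clear(&tsk->thread.fpu);
}
```

The point of the design is visible in the signature: a function that
only receives a 'struct fpu' pointer documents that it does not depend
on any other per-task state.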

Likewise, the FP state data types are now consistently named as well:

    struct fregs_state;
    struct fxregs_state;
    struct swregs_state;
    struct xregs_state;

    union fpregs_state;

3)

Various core data types got streamlined around four byte flags in 'struct fpu':

  fpu->fpstate_active          # was: tsk->flags & PF_USED_MATH
  fpu->fpregs_active           # was: fpu->has_fpu
  fpu->last_cpu
  fpu->counter

which now fit into a single word.
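
As a rough illustration of the "single word" claim, the sketch below
packs four one-byte fields into a struct and checks its size at compile
time. The mail names the fields but not their types, so the one-byte
types (in particular for last_cpu) and the sketch_fpu_flags name are
assumptions made purely for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical layout: one byte per field is an assumption of this
 * sketch, not something taken from the patches.
 */
struct sketch_fpu_flags {
	unsigned char fpstate_active;
	unsigned char fpregs_active;
	unsigned char last_cpu;
	unsigned char counter;
};

/* Four one-byte members need no padding, so they share one 32-bit word. */
_Static_assert(sizeof(struct sketch_fpu_flags) == 4,
	       "flags should pack into 4 bytes");
```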

4)

task->thread.fpu->state got embedded again, as task->thread.fpu.state. This
eliminated a lot of awkward late dynamic memory allocation of FPU state
and the problematic handling of failures.

Note that while the allocation is static right now, this is a WIP interim
state: we can still do dynamic allocation of FPU state, by moving the FPU
state last in task_struct and then allocating task_struct accordingly.
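
The "state last, then size the allocation accordingly" idea can be
sketched in plain C with a flexible array member. This is a user-space
illustration under stated assumptions: struct sketch_task,
sketch_task_alloc() and the fpstate_size field are hypothetical, and
the alignment that real FPU save areas require is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a task structure with the FPU state last. */
struct sketch_task {
	int pid;
	/* ... all other task fields would go here ... */
	size_t fpstate_size;		/* bytes available in fpstate[] */
	unsigned char fpstate[];	/* flexible array member, must be last */
};

/*
 * Allocate the task and its FPU state in one zeroed block.
 * 'fpstate_size' stands for a size that would be determined once at
 * boot; here the caller simply passes it in.
 */
static struct sketch_task *sketch_task_alloc(size_t fpstate_size)
{
	const size_t base = offsetof(struct sketch_task, fpstate);
	struct sketch_task *tsk;

	if (fpstate_size > SIZE_MAX - base)
		return NULL;		/* size computation would overflow */

	tsk = calloc(1, base + fpstate_size);
	if (!tsk)
		return NULL;

	tsk->fpstate_size = fpstate_size;
	return tsk;
}
```

With this layout the only allocation that can fail is the one for the
task itself, so no separate late failure path for the FPU state is
needed.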

5)

The amazingly convoluted init dependencies got sorted out, into two
cleanly separated families of initialization functions: the
fpu__init_system_*() functions, and the fpu__init_cpu_*() functions.

This allowed the removal of various __init annotation hacks and
obscure boot time checks.
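
A toy model of the two families, to show why the split removes the need
for run-once checks. All sketch_* names, the CPU count and the value
512 are hypothetical; having the system-level function call the per-CPU
one for the boot CPU is an assumption of this sketch, not a statement
about the patches.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* System-wide state: computed once, on the boot CPU. */
static unsigned int sketch_xstate_size;
static bool sketch_system_ready;

/* Per-CPU state: every CPU has to set up its own copy. */
static bool sketch_cpu_ready[SKETCH_NR_CPUS];

/* Hypothetical member of the per-CPU family: runs on every CPU. */
static void sketch_init_cpu(int cpu)
{
	sketch_cpu_ready[cpu] = true;
}

/* Hypothetical member of the system family: called once at boot. */
static void sketch_init_system(int boot_cpu)
{
	/* The call structure guarantees a single call; this only documents it. */
	assert(!sketch_system_ready);

	sketch_init_cpu(boot_cpu);	/* the boot CPU is a CPU too */
	sketch_xstate_size = 512;	/* placeholder for a probed value */
	sketch_system_ready = true;
}

/* Secondary CPUs only ever call the per-CPU family. */
static void sketch_secondary_startup(int cpu)
{
	assert(sketch_system_ready);
	sketch_init_cpu(cpu);
}
```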

6)

Decoupled the FPU core from the save code. xsave.c and xsave.h got
shrunk quite a bit, and they now host only XSAVE/etc. related
functionality, not generic FPU handling functions.

7)

Added a ton of comments explaining how things work and why, hopefully
making this code accessible to everyone interested.

8)

Added FPU debugging code (CONFIG_X86_DEBUG_FPU=y) and added an FPU hw
benchmarking subsystem (CONFIG_X86_DEBUG_FPU_MEASUREMENTS=y), which
performs boot time measurements like:

  x86/fpu:##################################################################
  x86/fpu: Running FPU performance measurement suite (cache hot):
  x86/fpu: Cost of: null                                      :   108 cycles
  x86/fpu:########  CPU instructions:           ############################
  x86/fpu: Cost of: NOP                         insn          :     0 cycles
  x86/fpu: Cost of: RDTSC                       insn          :    12 cycles
  x86/fpu: Cost of: RDMSR                       insn          :   100 cycles
  x86/fpu: Cost of: WRMSR                       insn          :   396 cycles
  x86/fpu: Cost of: CLI                         insn  same-IF :     0 cycles
  x86/fpu: Cost of: CLI                         insn  flip-IF :     0 cycles
  x86/fpu: Cost of: STI                         insn  same-IF :     0 cycles
  x86/fpu: Cost of: STI                         insn  flip-IF :     0 cycles
  x86/fpu: Cost of: PUSHF                       insn          :     0 cycles
  x86/fpu: Cost of: POPF                        insn  same-IF :    20 cycles
  x86/fpu: Cost of: POPF                        insn  flip-IF :    28 cycles
  x86/fpu:########  IRQ save/restore APIs:      ############################
  x86/fpu: Cost of: local_irq_save()            fn            :    20 cycles
  x86/fpu: Cost of: local_irq_restore()         fn    same-IF :    24 cycles
  x86/fpu: Cost of: local_irq_restore()         fn    flip-IF :    28 cycles
  x86/fpu: Cost of: irq_save()+restore()        fn    same-IF :    48 cycles
  x86/fpu: Cost of: irq_save()+restore()        fn    flip-IF :    48 cycles
  x86/fpu:########  locking APIs:               ############################
  x86/fpu: Cost of: smp_mb()                    fn            :    40 cycles
  x86/fpu: Cost of: cpu_relax()                 fn            :     8 cycles
  x86/fpu: Cost of: spin_lock()+unlock()        fn            :    64 cycles
  x86/fpu: Cost of: read_lock()+unlock()        fn            :    76 cycles
  x86/fpu: Cost of: write_lock()+unlock()       fn            :    52 cycles
  x86/fpu: Cost of: rcu_read_lock()+unlock()    fn            :    16 cycles
  x86/fpu: Cost of: preempt_disable()+enable()  fn            :    20 cycles
  x86/fpu: Cost of: mutex_lock()+unlock()       fn            :    56 cycles
  x86/fpu:########  MM instructions:            ############################
  x86/fpu: Cost of: __flush_tlb()               fn            :   132 cycles
  x86/fpu: Cost of: __flush_tlb_global()        fn            :   920 cycles
  x86/fpu: Cost of: __flush_tlb_one()           fn            :   288 cycles
  x86/fpu: Cost of: __flush_tlb_range()         fn            :   412 cycles
  x86/fpu:########  FPU instructions:           ############################
  x86/fpu: Cost of: CR0                         read          :     4 cycles
  x86/fpu: Cost of: CR0                         write         :   208 cycles
  x86/fpu: Cost of: CR0::TS                     fault         :  1156 cycles
  x86/fpu: Cost of: FNINIT                      insn          :    76 cycles
  x86/fpu: Cost of: FWAIT                       insn          :     0 cycles
  x86/fpu: Cost of: FSAVE                       insn          :   168 cycles
  x86/fpu: Cost of: FRSTOR                      insn          :   160 cycles
  x86/fpu: Cost of: FXSAVE                      insn          :    84 cycles
  x86/fpu: Cost of: FXRSTOR                     insn          :    44 cycles
  x86/fpu: Cost of: FXRSTOR                     fault         :   688 cycles
  x86/fpu: Cost of: XSAVE                       insn          :   104 cycles
  x86/fpu: Cost of: XRSTOR                      insn          :    80 cycles
  x86/fpu: Cost of: XRSTOR                      fault         :   884 cycles
  x86/fpu:##################################################################

Based on such measurements we'll be able to do performance tuning,
set default policies and do optimizations in a more informed fashion,
as the speed of various x86 hardware varies a lot.
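
For readers who want to experiment with this style of measurement
outside the kernel, here is a portable user-space sketch. It is not
the code in fpu/measure.c: it times in nanoseconds with
clock_gettime() rather than in cycles, and all sketch_* names and the
loop counts are made up. Subtracting an empty-function baseline is an
inference from the 'null' row in the table above, not something this
mail spells out.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

#define SKETCH_LOOPS 1000	/* calls per timed run */
#define SKETCH_RUNS  16		/* runs; the fastest one is kept */

typedef void (*sketch_fn_t)(void);

static uint64_t sketch_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Minimum time, over several runs, of SKETCH_LOOPS calls to fn. */
static uint64_t sketch_time_loop(sketch_fn_t fn)
{
	sketch_fn_t volatile call = fn;	/* keep the calls from being elided */
	uint64_t best = UINT64_MAX;

	for (int run = 0; run < SKETCH_RUNS; run++) {
		uint64_t t0 = sketch_now_ns();

		for (int i = 0; i < SKETCH_LOOPS; i++)
			call();

		uint64_t dt = sketch_now_ns() - t0;
		if (dt < best)
			best = dt;
	}
	return best;
}

/* Empty function: measures only the call and loop overhead. */
static void sketch_null(void)
{
}

/* Example workload; 'volatile' keeps the additions from being removed. */
static void sketch_work(void)
{
	volatile unsigned int x = 0;

	for (unsigned int i = 0; i < 100; i++)
		x += i;
}

/* Per-call cost of fn in ns, net of the null baseline, clamped at 0. */
static uint64_t sketch_cost_ns(sketch_fn_t fn)
{
	uint64_t base = sketch_time_loop(sketch_null);
	uint64_t total = sketch_time_loop(fn);

	return total > base ? (total - base) / SKETCH_LOOPS : 0;
}
```

Calling sketch_cost_ns(sketch_work) gives a per-call estimate. Taking
the minimum over several runs is a common way to filter out
interrupts and other noise, at the cost of reporting a best case.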

9)

Reworked many ancient inlining and uninlining decisions based on
modern principles.


Any feedback is welcome!

Thanks,

    Ingo

=====
Ingo Molnar (208):
  x86/fpu: Rename unlazy_fpu() to fpu__save()
  x86/fpu: Add comments to fpu__save() and restrict its export
  x86/fpu: Add debugging check to fpu__save()
  x86/fpu: Rename fpu_detect() to fpu__detect()
  x86/fpu: Remove stale init_fpu() prototype
  x86/fpu: Split an fpstate_alloc_init() function out of init_fpu()
  x86/fpu: Make init_fpu() static
  x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check
  x86/fpu: Optimize fpu__unlazy_stopped()
  x86/fpu: Simplify fpu__unlazy_stopped()
  x86/fpu: Remove fpu_allocated()
  x86/fpu: Move fpu_alloc() out of line
  x86/fpu: Rename fpu_alloc() to fpstate_alloc()
  x86/fpu: Rename fpu_free() to fpstate_free()
  x86/fpu: Rename fpu_finit() to fpstate_init()
  x86/fpu: Rename fpu_init() to fpu__cpu_init()
  x86/fpu: Rename init_thread_xstate() to fpstate_xstate_init_size()
  x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter
  x86/fpu: Improve the comment for the fpu::counter field
  x86/fpu: Move FPU data structures to asm/fpu_types.h
  x86/fpu: Clean up asm/fpu/types.h
  x86/fpu: Move i387.c and xsave.c to arch/x86/kernel/fpu/
  x86/fpu: Fix header file dependencies of fpu-internal.h
  x86/fpu: Split out the boot time FPU init code into fpu/init.c
  x86/fpu: Remove unnecessary includes from core.c
  x86/fpu: Move the no_387 handling and FPU detection code into init.c
  x86/fpu: Remove the free_thread_xstate() complication
  x86/fpu: Factor out fpu__flush_thread() from flush_thread()
  x86/fpu: Move math_state_restore() to fpu/core.c
  x86/fpu: Rename math_state_restore() to fpu__restore()
  x86/fpu: Factor out the FPU bug detection code into fpu__init_check_bugs()
  x86/fpu: Simplify the xsave_state*() methods
  x86/fpu: Remove fpu_xsave()
  x86/fpu: Move task_xstate_cachep handling to core.c
  x86/fpu: Factor out fpu__copy()
  x86/fpu: Uninline fpstate_free() and move it next to the allocation function
  x86/fpu: Make task_xstate_cachep static
  x86/fpu: Make kernel_fpu_disable/enable() static
  x86/fpu: Add debug check to kernel_fpu_disable()
  x86/fpu: Add kernel_fpu_disabled()
  x86/fpu: Remove __save_init_fpu()
  x86/fpu: Move fpu_copy() to fpu/core.c
  x86/fpu: Add debugging check to fpu_copy()
  x86/fpu: Print out whether we are doing lazy/eager FPU context switches
  x86/fpu: Eliminate the __thread_has_fpu() wrapper
  x86/fpu: Change __thread_clear_has_fpu() to 'struct fpu' parameter
  x86/fpu: Move 'PER_CPU(fpu_owner_task)' to fpu/core.c
  x86/fpu: Change fpu_owner_task to fpu_fpregs_owner_ctx
  x86/fpu: Remove 'struct task_struct' usage from __thread_set_has_fpu()
  x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_end()
  x86/fpu: Remove 'struct task_struct' usage from __thread_fpu_begin()
  x86/fpu: Open code PF_USED_MATH usages
  x86/fpu: Document fpu__unlazy_stopped()
  x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active
  x86/fpu: Remove 'struct task_struct' usage from drop_fpu()
  x86/fpu: Remove task_disable_lazy_fpu_restore()
  x86/fpu: Use 'struct fpu' in fpu_lazy_restore()
  x86/fpu: Use 'struct fpu' in restore_fpu_checking()
  x86/fpu: Use 'struct fpu' in fpu_reset_state()
  x86/fpu: Use 'struct fpu' in switch_fpu_prepare()
  x86/fpu: Use 'struct fpu' in switch_fpu_finish()
  x86/fpu: Move __save_fpu() into fpu/core.c
  x86/fpu: Use 'struct fpu' in __fpu_save()
  x86/fpu: Use 'struct fpu' in fpu__save()
  x86/fpu: Use 'struct fpu' in fpu_copy()
  x86/fpu: Use 'struct fpu' in fpu__copy()
  x86/fpu: Use 'struct fpu' in fpstate_alloc_init()
  x86/fpu: Use 'struct fpu' in fpu__unlazy_stopped()
  x86/fpu: Rename fpu__flush_thread() to fpu__clear()
  x86/fpu: Clean up fpu__clear() a bit
  x86/fpu: Rename i387.h to fpu/api.h
  x86/fpu: Move xsave.h to fpu/xsave.h
  x86/fpu: Rename fpu-internal.h to fpu/internal.h
  x86/fpu: Move MXCSR_DEFAULT to fpu/internal.h
  x86/fpu: Remove xsave_init() __init obfuscation
  x86/fpu: Remove assembly guard from asm/fpu/api.h
  x86/fpu: Improve FPU detection kernel messages
  x86/fpu: Print supported xstate features in human readable way
  x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask'
  x86/fpu: Rename 'xstate_features' to 'xfeatures_nr'
  x86/fpu: Move XCR0 manipulation to the FPU code proper
  x86/fpu: Clean up regset functions
  x86/fpu: Rename 'xsave_hdr' to 'header'
  x86/fpu: Rename xsave.header::xstate_bv to 'xfeatures'
  x86/fpu: Clean up and fix MXCSR handling
  x86/fpu: Rename regset FPU register accessors
  x86/fpu: Explain the AVX register layout in the xsave area
  x86/fpu: Improve the __sanitize_i387_state() documentation
  x86/fpu: Rename fpu->has_fpu to fpu->fpregs_active
  x86/fpu: Rename __thread_set_has_fpu() to __fpregs_activate()
  x86/fpu: Rename __thread_clear_has_fpu() to __fpregs_deactivate()
  x86/fpu: Rename __thread_fpu_begin() to fpregs_activate()
  x86/fpu: Rename __thread_fpu_end() to fpregs_deactivate()
  x86/fpu: Remove fpstate_xstate_init_size() boot quirk
  x86/fpu: Remove xsave_init() bootmem allocations
  x86/fpu: Make setup_init_fpu_buf() run-once explicitly
  x86/fpu: Remove 'init_xstate_buf' bootmem allocation
  x86/fpu: Split fpu__cpu_init() into early-boot and cpu-boot parts
  x86/fpu: Make the system/cpu init distinction clear in the xstate code as well
  x86/fpu: Move CPU capability check into fpu__init_cpu_xstate()
  x86/fpu: Move legacy check to fpu__init_system_xstate()
  x86/fpu: Propagate once per boot quirk into fpu__init_system_xstate()
  x86/fpu: Remove xsave_init()
  x86/fpu: Do fpu__init_system_xstate only from fpu__init_system()
  x86/fpu: Set up the legacy FPU init image from fpu__init_system()
  x86/fpu: Remove setup_init_fpu_buf() call from eager_fpu_init()
  x86/fpu: Move all eager-fpu setup code to eager_fpu_init()
  x86/fpu: Move eager_fpu_init() to fpu/init.c
  x86/fpu: Clean up eager_fpu_init() and rename it to fpu__ctx_switch_init()
  x86/fpu: Split fpu__ctx_switch_init() into _cpu() and _system() portions
  x86/fpu: Do CLTS fpu__init_system()
  x86/fpu: Move the fpstate_xstate_init_size() call into fpu__init_system()
  x86/fpu: Call fpu__init_cpu_ctx_switch() from fpu__init_cpu()
  x86/fpu: Do system-wide setup from fpu__detect()
  x86/fpu: Remove fpu__init_cpu_ctx_switch() call from fpu__init_system()
  x86/fpu: Simplify fpu__cpu_init()
  x86/fpu: Factor out fpu__init_cpu_generic()
  x86/fpu: Factor out fpu__init_system_generic()
  x86/fpu: Factor out fpu__init_system_early_generic()
  x86/fpu: Move !FPU check into fpu__init_system_early_generic()
  x86/fpu: Factor out FPU bug checks into fpu/bugs.c
  x86/fpu: Make check_fpu() init ordering independent
  x86/fpu: Move fpu__init_system_early_generic() out of fpu__detect()
  x86/fpu: Remove the extra fpu__detect() layer
  x86/fpu: Rename fpstate_xstate_init_size() to fpu__init_system_xstate_size_legacy()
  x86/fpu: Reorder init methods
  x86/fpu: Add more comments to the FPU init code
  x86/fpu: Move fpu__save() to fpu/internals.h
  x86/fpu: Uninline kernel_fpu_begin()/end()
  x86/fpu: Move various internal function prototypes to fpu/internal.h
  x86/fpu: Uninline the irq_ts_save()/restore() functions
  x86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate()
  x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization with FP exceptions
  x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again)
  x86/fpu: Remove failure paths from fpstate-alloc low level functions
  x86/fpu: Remove failure return from fpstate_alloc_init()
  x86/fpu: Rename fpstate_alloc_init() to fpstate_init_curr()
  x86/fpu: Simplify fpu__unlazy_stopped() error handling
  x86/fpu, kvm: Simplify fx_init()
  x86/fpu: Simplify fpstate_init_curr() usage
  x86/fpu: Rename fpu__unlazy_stopped() to fpu__activate_stopped()
  x86/fpu: Factor out FPU hw activation/deactivation
  x86/fpu: Simplify __save_fpu()
  x86/fpu: Eliminate __save_fpu()
  x86/fpu: Simplify fpu__save()
  x86/fpu: Optimize fpu__save()
  x86/fpu: Optimize fpu_copy()
  x86/fpu: Optimize fpu_copy() some more on lazy switching systems
  x86/fpu: Rename fpu/xsave.h to fpu/xstate.h
  x86/fpu: Rename fpu/xsave.c to fpu/xstate.c
  x86/fpu: Introduce cpu_has_xfeatures(xfeatures_mask, feature_name)
  x86/fpu: Simplify print_xstate_features()
  x86/fpu: Enumerate xfeature bits
  x86/fpu: Move xfeature type enumeration to fpu/types.h
  x86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature checks
  x86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks
  x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks
  x86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks
  x86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks
  x86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks
  x86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks
  x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks
  x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks
  x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks
  x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c
  x86/fpu: Move asm/xcr.h to asm/fpu/internal.h
  x86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate()
  x86/fpu: Simplify fpstate_sanitize_xstate() calls
  x86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate()
  x86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe()
  x86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe()
  x86/fpu: Clarify ancient comments in fpu__restore()
  x86/fpu: Rename user_has_fpu() to fpregs_active()
  x86/fpu: Initialize fpregs in fpu__init_cpu_generic()
  x86/fpu: Clean up fpu__clear() state handling
  x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state()
  x86/fpu: Synchronize the naming of drop_fpu() and fpu_reset_state()
  x86/fpu: Rename restore_fpu_checking() to copy_fpstate_to_fpregs()
  x86/fpu: Move all the fpu__*() high level methods closer to each other
  x86/fpu: Move fpu__clear() to 'struct fpu *' parameter passing
  x86/fpu: Rename restore_xstate_sig() to fpu__restore_sig()
  x86/fpu: Move the signal frame handling code closer to each other
  x86/fpu: Merge fpu__reset() and fpu__clear()
  x86/fpu: Move is_ia32*frame() helpers out of fpu/internal.h
  x86/fpu: Split out fpu/signal.h from fpu/internal.h for signal frame handling functions
  x86/fpu: Factor out fpu/regset.h from fpu/internal.h
  x86/fpu: Remove run-once init quirks
  x86/fpu: Factor out the exception error code handling code
  x86/fpu: Harmonize the names of the fpstate_init() helper functions
  x86/fpu: Create 'union thread_xstate' helper for fpstate_init()
  x86/fpu: Generalize 'init_xstate_ctx'
  x86/fpu: Move restore_init_xstate() out of fpu/internal.h
  x86/fpu: Rename all the fpregs, xregs, fxregs and fregs handling functions
  x86/fpu: Factor out fpu/signal.c
  x86/fpu: Factor out the FPU regset code into fpu/regset.c
  x86/fpu: Harmonize FPU register state types
  x86/fpu: Change fpu->fpregs_active from 'int' to 'char', add lazy switching comments
  x86/fpu: Document the various fpregs state formats
  x86/fpu: Move debugging check from kernel_fpu_begin() to __kernel_fpu_begin()
  x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end
  x86/fpu: Clean up xstate feature reservation
  x86/fpu/xstate: Clean up setup_xstate_comp() call
  x86/fpu/init: Propagate __init annotations
  x86/fpu: Pass 'struct fpu' to fpu__restore()
  x86/fpu: Fix the 'nofxsr' boot parameter to also clear X86_FEATURE_FXSR_OPT
  x86/fpu: Add CONFIG_X86_DEBUG_FPU=y FPU debugging code
  x86/fpu: Add FPU performance measurement subsystem
  x86/fpu: Reorganize fpu/internal.h

 Documentation/preempt-locking.txt              |   2 +-
 arch/x86/Kconfig.debug                         |  27 ++
 arch/x86/crypto/aesni-intel_glue.c             |   2 +-
 arch/x86/crypto/camellia_aesni_avx2_glue.c     |  15 +-
 arch/x86/crypto/camellia_aesni_avx_glue.c      |  15 +-
 arch/x86/crypto/cast5_avx_glue.c               |  15 +-
 arch/x86/crypto/cast6_avx_glue.c               |  15 +-
 arch/x86/crypto/crc32-pclmul_glue.c            |   2 +-
 arch/x86/crypto/crc32c-intel_glue.c            |   3 +-
 arch/x86/crypto/crct10dif-pclmul_glue.c        |   2 +-
 arch/x86/crypto/fpu.c                          |   2 +-
 arch/x86/crypto/ghash-clmulni-intel_glue.c     |   2 +-
 arch/x86/crypto/serpent_avx2_glue.c            |  15 +-
 arch/x86/crypto/serpent_avx_glue.c             |  15 +-
 arch/x86/crypto/sha-mb/sha1_mb.c               |   5 +-
 arch/x86/crypto/sha1_ssse3_glue.c              |  16 +-
 arch/x86/crypto/sha256_ssse3_glue.c            |  16 +-
 arch/x86/crypto/sha512_ssse3_glue.c            |  16 +-
 arch/x86/crypto/twofish_avx_glue.c             |  16 +-
 arch/x86/ia32/ia32_signal.c                    |  13 +-
 arch/x86/include/asm/alternative.h             |   6 +
 arch/x86/include/asm/crypto/glue_helper.h      |   2 +-
 arch/x86/include/asm/efi.h                     |   2 +-
 arch/x86/include/asm/fpu-internal.h            | 626 ---------------------------------------
 arch/x86/include/asm/fpu/api.h                 |  48 +++
 arch/x86/include/asm/fpu/internal.h            | 488 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/fpu/measure.h             |  13 +
 arch/x86/include/asm/fpu/regset.h              |  21 ++
 arch/x86/include/asm/fpu/signal.h              |  33 +++
 arch/x86/include/asm/fpu/types.h               | 293 ++++++++++++++++++
 arch/x86/include/asm/{xsave.h => fpu/xstate.h} |  60 ++--
 arch/x86/include/asm/i387.h                    | 108 -------
 arch/x86/include/asm/kvm_host.h                |   2 -
 arch/x86/include/asm/mpx.h                     |   8 +-
 arch/x86/include/asm/processor.h               | 141 +--------
 arch/x86/include/asm/simd.h                    |   2 +-
 arch/x86/include/asm/stackprotector.h          |   2 +
 arch/x86/include/asm/suspend_32.h              |   2 +-
 arch/x86/include/asm/suspend_64.h              |   2 +-
 arch/x86/include/asm/user.h                    |  12 +-
 arch/x86/include/asm/xcr.h                     |  49 ---
 arch/x86/include/asm/xor.h                     |   2 +-
 arch/x86/include/asm/xor_32.h                  |   2 +-
 arch/x86/include/asm/xor_avx.h                 |   2 +-
 arch/x86/include/uapi/asm/sigcontext.h         |   8 +-
 arch/x86/kernel/Makefile                       |   2 +-
 arch/x86/kernel/alternative.c                  |   5 +
 arch/x86/kernel/cpu/bugs.c                     |  57 +---
 arch/x86/kernel/cpu/bugs_64.c                  |   2 +
 arch/x86/kernel/cpu/common.c                   |  29 +-
 arch/x86/kernel/fpu/Makefile                   |  11 +
 arch/x86/kernel/fpu/bugs.c                     |  71 +++++
 arch/x86/kernel/fpu/core.c                     | 509 +++++++++++++++++++++++++++++++
 arch/x86/kernel/fpu/init.c                     | 288 ++++++++++++++++++
 arch/x86/kernel/fpu/measure.c                  | 509 +++++++++++++++++++++++++++++++
 arch/x86/kernel/fpu/regset.c                   | 356 ++++++++++++++++++++++
 arch/x86/kernel/fpu/signal.c                   | 404 +++++++++++++++++++++++++
 arch/x86/kernel/fpu/xstate.c                   | 406 +++++++++++++++++++++++++
 arch/x86/kernel/i387.c                         | 656 ----------------------------------------
 arch/x86/kernel/process.c                      |  52 +---
 arch/x86/kernel/process_32.c                   |  15 +-
 arch/x86/kernel/process_64.c                   |  13 +-
 arch/x86/kernel/ptrace.c                       |  12 +-
 arch/x86/kernel/signal.c                       |  38 ++-
 arch/x86/kernel/smpboot.c                      |   3 +-
 arch/x86/kernel/traps.c                        | 120 ++------
 arch/x86/kernel/xsave.c                        | 724 ---------------------------------------------
 arch/x86/kvm/cpuid.c                           |   2 +-
 arch/x86/kvm/vmx.c                             |   5 +-
 arch/x86/kvm/x86.c                             |  68 ++---
 arch/x86/lguest/boot.c                         |   2 +-
 arch/x86/lib/mmx_32.c                          |   2 +-
 arch/x86/math-emu/fpu_aux.c                    |   4 +-
 arch/x86/math-emu/fpu_entry.c                  |  20 +-
 arch/x86/math-emu/fpu_system.h                 |   2 +-
 arch/x86/mm/mpx.c                              |  15 +-
 arch/x86/power/cpu.c                           |  11 +-
 arch/x86/xen/enlighten.c                       |   2 +-
 drivers/char/hw_random/via-rng.c               |   2 +-
 drivers/crypto/padlock-aes.c                   |   2 +-
 drivers/crypto/padlock-sha.c                   |   2 +-
 drivers/lguest/x86/core.c                      |  12 +-
 lib/raid6/x86.h                                |   2 +-
 83 files changed, 3742 insertions(+), 2841 deletions(-)
 delete mode 100644 arch/x86/include/asm/fpu-internal.h
 create mode 100644 arch/x86/include/asm/fpu/api.h
 create mode 100644 arch/x86/include/asm/fpu/internal.h
 create mode 100644 arch/x86/include/asm/fpu/measure.h
 create mode 100644 arch/x86/include/asm/fpu/regset.h
 create mode 100644 arch/x86/include/asm/fpu/signal.h
 create mode 100644 arch/x86/include/asm/fpu/types.h
 rename arch/x86/include/asm/{xsave.h => fpu/xstate.h} (77%)
 delete mode 100644 arch/x86/include/asm/i387.h
 delete mode 100644 arch/x86/include/asm/xcr.h
 create mode 100644 arch/x86/kernel/fpu/Makefile
 create mode 100644 arch/x86/kernel/fpu/bugs.c
 create mode 100644 arch/x86/kernel/fpu/core.c
 create mode 100644 arch/x86/kernel/fpu/init.c
 create mode 100644 arch/x86/kernel/fpu/measure.c
 create mode 100644 arch/x86/kernel/fpu/regset.c
 create mode 100644 arch/x86/kernel/fpu/signal.c
 create mode 100644 arch/x86/kernel/fpu/xstate.c
 delete mode 100644 arch/x86/kernel/i387.c
 delete mode 100644 arch/x86/kernel/xsave.c

-- 
2.1.0