lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55A6FC31.5010102@linux.intel.com>
Date:	Wed, 15 Jul 2015 17:34:57 -0700
From:	Dave Hansen <dave.hansen@...ux.intel.com>
To:	Ingo Molnar <mingo@...nel.org>
CC:	linux-kernel@...r.kernel.org,
	Andy Lutomirski <luto@...capital.net>,
	Borislav Petkov <bp@...en8.de>,
	Fenghua Yu <fenghua.yu@...el.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>
Subject: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework

On 07/15/2015 04:07 AM, Ingo Molnar wrote:
> * Dave Hansen <dave.hansen@...ux.intel.com> wrote:
>>>  	/*
>>> -	 * Setup init_xstate_buf to represent the init state of
>>> +	 * Setup init_xstate_ctx to represent the init state of
>>>  	 * all the features managed by the xsave
>>>  	 */
>>> -	init_xstate_buf = alloc_bootmem_align(xstate_size,
>>> -					      __alignof__(struct xsave_struct));
>>> -	fx_finit(&init_xstate_buf->i387);
>>> +	fx_finit(&init_xstate_ctx.i387);
>>
>> This is causing memory corruption in 4.2-rc2.
...
>> This patch works around the problem, btw:
>>
>> 	https://www.sr71.net/~dave/intel/bloat-xsave-gunk-2.patch
> 
> Yeah, so I got this prototype hardware boot crash reported in private mail and 
> decoded it and after some debugging I suggested the +PAGE_SIZE hack - possibly you 
> got that hack from the same person?

Nope, I came up with that gem of a patch all on my own.

I also wouldn't characterize this as prototype hardware.  There are
obviously plenty of folks depending on mainline to boot and function on
hardware that has AVX-512 support.  That's why two different Intel folks
came to you independently.

> My suggestion was to solve this properly: if we list xstate features as supported 
> then we should size their max size correctly. The AVX bits are currently not 
> properly enumerated and sized - and I refuse to add feature support to the kernel 
> where per task CPU state fields that the kernel saves/restores are opaque...

We might know the size and composition of the individual components, but
we do not know the size of the buffer.  Different implementations of a
given feature are quite free to have different data stored in the
buffer, or even to rearrange or pad it.  That's why the sizes are not
explicitly called out by the architecture and why we enumerated them
before your patch that caused this regression.

The component itself may not be opaque, but the size of the *buffer* is
not a simple sum of the component sizes.  Here's a real-world example:

[    0.000000] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
[    0.000000] x86/fpu: xstate_offset[3]: 03c0, xstate_sizes[3]: 0040

Notice that component 3 is not at 0x240+0x100.  This means our existing
init_xstate_size(), and why any attempt to staticlly-size the buffer is
broken.

I understand why you were misled by it, but the old "xsave_hdr_struct"
was wrong.  Fenghua even posted patches to remove it before the FPU
rework (you were cc'd):

	https://lkml.org/lkml/2015/4/18/164

> So please add proper AVX512 support structures to fpu/types.h and size 
> XSTATE_RESERVE correctly - or alternatively we can remove the current incomplete 
> AVX512 bits.

The old code sized the buffer in a fully architectural way and it
worked.  The CPU *tells* you how much memory the 'xsave' instruction is
going to scribble on.  The new code just merrily calls it and let it
scribble away.  This is as clear-cut a regression as I've ever seen.

The least we can do is detect that the kernel undersized the buffer and
disable support for the features that do not fit.  A very lightly tested
patch to do that is attached.  I'm not super eager to put that in to an
-rc2 kernel though.

View attachment "xsaves-init-buf.patch" of type "text/x-patch" (7855 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ