Message-ID: <55A9341B.2050401@linux.intel.com>
Date: Fri, 17 Jul 2015 09:58:03 -0700
From: Dave Hansen <dave.hansen@...ux.intel.com>
To: Ingo Molnar <mingo@...nel.org>
CC: linux-kernel@...r.kernel.org,
Andy Lutomirski <luto@...capital.net>,
Borislav Petkov <bp@...en8.de>,
Fenghua Yu <fenghua.yu@...el.com>,
"H. Peter Anvin" <hpa@...or.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ross Zwisler <ross.zwisler@...ux.intel.com>
Subject: Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework
On 07/17/2015 12:45 AM, Ingo Molnar wrote:
> Just curious: does any released hardware have AVX-512? I went by Wikipedia, which
> seems to list pre-release hw:
>> We might know the size and composition of the individual components, but we do
>> not know the size of the buffer. Different implementations of a given feature
>> are quite free to have different data stored in the buffer, or even to rearrange
>> or pad it. That's why the sizes are not explicitly called out by the
>> architecture and why we enumerated them before your patch that caused this
>> regression.
>
> But we _have_ to know their structure and layout of the XSAVE context for any
> reasonable ptrace and signal frame support.
There are two different things here. One is the structure and layout
inside of the state components. That obviously needs full kernel
knowledge and cannot be opaque, especially when the kernel needs to go
looking at it (as with MPX's BNDCSR, for instance).
But, the relative layout of the components is up for grabs. The CPU is
completely free (architecturally) to pad components or rearrange things.
It's not opaque (it's fully enumerated in CPUID), but it's far from
something which is static or which we can realistically represent in a
static structure.
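
To make the sizing point concrete, here is a rough sketch (not kernel code)
of how a standard-format buffer size falls out of the enumerated values
rather than out of a static struct. It assumes the per-component size and
offset have already been read from CPUID.(EAX=0xD, ECX=i), where EAX is the
size and EBX the offset for i >= 2. The names xstate_comp and xsave_size_for
are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One entry per XSAVE state component, holding what
 * CPUID.(EAX=0xD, ECX=i) reports for i >= 2:
 * EAX = size in bytes, EBX = offset in the standard (non-compacted) format.
 */
struct xstate_comp {
    uint32_t size;
    uint32_t offset;
};

/* 512-byte legacy FXSAVE area plus the 64-byte XSAVE header. */
#define XSAVE_LEGACY_AND_HEADER 576u

/*
 * Hypothetical helper: bytes needed for a standard-format XSAVE buffer
 * holding the components selected in 'mask'. comps[i] describes
 * component i; entries 0 and 1 are ignored because x87 and SSE state
 * live in the legacy area.
 */
static uint32_t xsave_size_for(const struct xstate_comp *comps, size_t n,
                               uint64_t mask)
{
    uint32_t end = XSAVE_LEGACY_AND_HEADER;

    for (size_t i = 2; i < n && i < 64; i++) {
        if (!(mask & (1ULL << i)))
            continue;
        /* The CPU may pad or reorder, so trust the enumerated offset. */
        uint32_t comp_end = comps[i].offset + comps[i].size;
        if (comp_end > end)
            end = comp_end;
    }
    return end;
}
```

The result depends only on the table handed in, so a CPU that pads or
rearranges components simply yields a different number. CPUID.(EAX=0xD,
ECX=0) also reports the required size directly in EBX for the features
currently enabled in XCR0.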
> Can you set/get AVX-512 registers via ptrace? MPX state?
The xsave buffer is just copied out to userspace with REGSET_XSTATE.
To parse it, userspace needs to do the same song and dance with CPUID
that the kernel does.
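
For illustration, a minimal sketch of the userspace side, assuming the
buffer is in the standard XSAVE format and that the caller has already
looked up the component's offset and size via CPUID leaf 0xD. The helper
name xstate_component is hypothetical. The only layout fact baked in is that
XSTATE_BV occupies the first 8 bytes of the XSAVE header, at byte 512.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XSTATE_BV is the first 8 bytes of the XSAVE header, at byte 512. */
#define XSTATE_BV_OFFSET 512u

/*
 * Hypothetical helper: return a pointer to state component 'idx' inside
 * a standard-format XSAVE buffer, or NULL if it is not present.
 * 'offset' and 'size' must come from CPUID.(EAX=0xD, ECX=idx).
 */
static const uint8_t *xstate_component(const uint8_t *buf, size_t len,
                                       unsigned idx, uint32_t offset,
                                       uint32_t size)
{
    uint64_t xstate_bv;

    if (idx >= 64 || len < XSTATE_BV_OFFSET + sizeof(xstate_bv))
        return NULL;

    /* memcpy avoids an unaligned 64-bit load from the byte buffer. */
    memcpy(&xstate_bv, buf + XSTATE_BV_OFFSET, sizeof(xstate_bv));

    /* Bit clear: the component was in its init state and was not saved. */
    if (!(xstate_bv & (1ULL << idx)))
        return NULL;

    /* Reject components that would run past the end of the buffer. */
    if ((size_t)offset + size > len)
        return NULL;

    return buf + offset;
}
```

A debugger would call this once per component it cares about, after filling
the buffer from the XSTATE regset, and treat NULL as "registers hold their
init values".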
>> This came out a lot more complicated than I would have liked.
>>
>> Instead of simply enabling all of the XSAVE features that we both know about and
>> the CPU supports, we have to be careful to not overflow our buffer in
>> 'init_fpstate.xsave'.
>
> Yeah, and this can be fixed separately and on top of your fix: my plan during the
> FPU rework was to move the context area to the end of task_struct and size it
> dynamically.
>
> This needs some (very minor) changes to kernel/fork.c to allow an architecture to
> determine the full task_struct size dynamically - but looks very doable and clean.
> Wanna try this, or should I?
I think you already did this later in the thread.