Message-Id: <d662bbd4-ed39-47ef-b2a4-012c765ec4ad@www.fastmail.com>
Date: Fri, 03 Dec 2021 15:51:26 +0000
From: "Jiaxun Yang" <jiaxun.yang@...goat.com>
To: "Dave Hansen" <dave.hansen@...el.com>, x86@...nel.org
Cc: "Thomas Gleixner" <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...hat.com>, "Borislav Petkov" <bp@...en8.de>,
dave.hansen@...ux.intel.com, hpa@...or.com,
chang.seok.bae@...el.com, linux-kernel@...r.kernel.org,
"Jiaxun Yang" <j.yang-87@....ed.ac.uk>
Subject: Re: [RFC PATCH 07/10] x86/fpu: Rellocate fpstate on save_fpregs_to_fpstate
On Dec 3, 2021, at 3:18 PM, Dave Hansen wrote:
> On 12/3/21 3:39 AM, Jiaxun Yang wrote:
>>>> if (likely(use_xsave())) {
>>>> + xstate_update_size(fpu);
>>>> os_xsave(fpu->fpstate);
>>>> update_avx_timestamp(fpu);
>>>> return;
>>> Have you considered what exactly happens when you hit that WARN_ON_FPU()
>>> which otherwise ignores the allocation error? Have you considered what
>>> happens on the os_xsave() that follows it immediately? How about what
>>> happens the next time this task runs after that failure?
>> Thank you for the catch.
>> This is a few questions that I don't have answer, so it's a RFC.
>>
>> I thought it is unlikely to happen as kmalloc has emergency pool.
>> But in case it happens, I guess the best way to handle it is just
>> send SIGILL to corresponding user process or panic if it's kernel
>> fpu use?
>
> We've thought a *LOT* about this exact problem over the past few years.
>
> Intel even added hardware (XFD) to prevent the situation where you land
> in the context switch code, fail a memory allocation, and have to
> destroy user data in registers. Without XFD, there are also zero ways
> to avoid this happening to apps, *other* than preallocating the memory
> in the first place.
>
> I don't think there is *any* viable path forward with this series.
Hmm, actually I can come up with some ways to work around it.
For example, we could keep a preallocated emergency pool of buffers sized
for max_feature and use them if an allocation fails during context switch.
We'd get a chance to refill the pool after returning from interrupt context :-)
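
To make the idea concrete, here is a rough userspace sketch of such a pool. It is not kernel code and not part of the series: every name in it (emergency_pool, pool_refill, pool_take, state_buf_get, POOL_SLOTS, MAX_STATE_SIZE) is hypothetical, malloc() stands in for the kernel allocator, and the alloc_ok flag only simulates an allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS     4     /* hypothetical pool depth */
#define MAX_STATE_SIZE 8192  /* hypothetical stand-in for the max_feature buffer size */

struct emergency_pool {
	void *slot[POOL_SLOTS];  /* preallocated buffers, each MAX_STATE_SIZE bytes */
	size_t avail;            /* number of buffers currently held in slot[] */
};

/*
 * Top the pool back up. Meant to be called from a context where a normal
 * allocation is allowed. Returns the number of buffers now available.
 */
static size_t pool_refill(struct emergency_pool *p)
{
	assert(p != NULL);
	while (p->avail < POOL_SLOTS) {
		void *buf = malloc(MAX_STATE_SIZE);

		if (!buf)
			break;  /* try again on a later refill */
		p->slot[p->avail++] = buf;
	}
	return p->avail;
}

/* Take one preallocated buffer, or NULL if the pool is empty. */
static void *pool_take(struct emergency_pool *p)
{
	if (p->avail == 0)
		return NULL;
	return p->slot[--p->avail];
}

/*
 * Get a state buffer: try the normal allocator first and fall back to the
 * pool on failure. alloc_ok simulates whether the normal allocation succeeds.
 */
static void *state_buf_get(struct emergency_pool *p, size_t size, bool alloc_ok)
{
	void *buf = alloc_ok ? malloc(size) : NULL;

	if (buf)
		return buf;
	return pool_take(p);
}
```

Note that state_buf_get() can still return NULL once the pool is drained before the next refill, so a pool like this only makes the failure less likely; it does not remove it.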
But maybe you are right, it's not for me, a first-year undergraduate student,
to comment on solutions from thousands of brilliant brains at Intel.
I appreciate your comments, which helped me understand the nature of the problem.
Thanks.
--
- Jiaxun