Message-ID: <8f488a03-876b-481a-a20d-5d372fbe1dcf@zytor.com>
Date: Wed, 8 Jan 2025 15:32:36 -0800
From: Xin Li <xin@...or.com>
To: Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
andrew.cooper3@...rix.com
Subject: Re: [PATCH v1 1/1] x86/fred: Fix the FRED RSP0 MSR out of sync with
its per CPU cache
On 1/8/2025 3:04 PM, Dave Hansen wrote:
> On 1/7/25 18:36, Xin Li (Intel) wrote:
>> The FRED RSP0 MSR (pointing to the top of the kernel stack for user
>> level event delivery) and its per CPU cache should be kept in sync to
>> avoid redundant writes in the exit to user space path. As a result,
>> a write to the FRED RSP0 MSR is paired with a write to its per CPU
>> cache, as fred_update_rsp0() does.
>
> I _think_ you're trying to explain the general use of a per-cpu MSR
> cache. That's good. But I was reading this paragraph and thinking at
> this point that the bug had something to do with redundant writes to the
> MSR.
>
> How about this?
>
> The FRED RSP0 MSR is only used for delivering events when
> running userspace. The kernel leverages this property to reduce
> expensive MSR writes and optimize context switches. The kernel
> only writes the MSR when about to run userspace *and* when the
> MSR has actually changed since the last time userspace ran.
>
> This optimization is implemented by maintaining a per-cpu cache
> of FRED RSP0 and then checking that against the value for the
> current task's stack before running userspace.
>
> However, cpu_init_fred_exceptions() writes the MSR without
> updating the per-cpu cache. This means that the kernel might
> return to userspace with MSR_IA32_FRED_RSP0==0 when it needed to
> point to the current task stack. This would induce a double
> fault (#DF), which is bad.
>
> A context switch after cpu_init_fred_exceptions() can paper over
> the issue since it updates the cached value. That evidently
> happens most of the time, explaining how this bug got through.
>
This is a way better explanation, thanks a lot!
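
For illustration only, here is a minimal userspace sketch of the caching
scheme described above and of how a write that bypasses the cache leaves
the MSR stale. It is a simplified single-CPU model, not the kernel code:
every sim_* name, the stack address 0x1000, and the sync_cache flag are
hypothetical, and resyncing the cache in the init path is just one
assumed way to restore the invariant, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated state for one CPU (all names here are hypothetical). */
static uint64_t sim_msr_rsp0;     /* stands in for MSR_IA32_FRED_RSP0 */
static uint64_t sim_cached_rsp0;  /* stands in for the per-CPU cache */

/* Stand-in for an expensive MSR write. */
static void sim_wrmsr_rsp0(uint64_t val)
{
	sim_msr_rsp0 = val;
}

/* Exit-to-user path: write the MSR only if the cached value differs. */
static void sim_update_rsp0(uint64_t task_stack_top)
{
	if (sim_cached_rsp0 != task_stack_top) {
		sim_wrmsr_rsp0(task_stack_top);
		sim_cached_rsp0 = task_stack_top;
	}
}

/*
 * Init path that resets the MSR to 0. With sync_cache == false it
 * models the bug (cache left untouched); with sync_cache == true it
 * models an assumed fix that keeps the cache in step with the MSR.
 */
static void sim_cpu_init(bool sync_cache)
{
	sim_wrmsr_rsp0(0);
	if (sync_cache)
		sim_cached_rsp0 = 0;
}

/* Returns the MSR value in effect when the task returns to userspace. */
static uint64_t sim_scenario(bool sync_cache)
{
	const uint64_t stack_top = 0x1000; /* arbitrary example address */

	sim_msr_rsp0 = 0;
	sim_cached_rsp0 = 0;

	sim_update_rsp0(stack_top);  /* MSR and cache both hold stack_top */
	sim_cpu_init(sync_cache);    /* MSR is reset to 0 */
	sim_update_rsp0(stack_top);  /* same task heads back to userspace */

	return sim_msr_rsp0;
}

/* Checks both variants of the scenario. */
static void sim_demo(void)
{
	assert(sim_scenario(false) == 0);      /* stale cache hides the reset */
	assert(sim_scenario(true) == 0x1000);  /* cache in sync, MSR rewritten */
}
```

In the unsynced case the comparison in sim_update_rsp0() sees a cache
that still matches the task's stack top, so the write is skipped and the
MSR stays at 0. A switch to a task with a different stack top would make
the comparison fail and rewrite the MSR, which is the "paper over" case.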