linux-kernel - Re: [PATCH v1 1/1] x86/fred: Fix the FRED RSP0 MSR out of sync with its per CPU cache

Message-ID: <8f488a03-876b-481a-a20d-5d372fbe1dcf@zytor.com>
Date: Wed, 8 Jan 2025 15:32:36 -0800
From: Xin Li <xin@...or.com>
To: Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
        andrew.cooper3@...rix.com
Subject: Re: [PATCH v1 1/1] x86/fred: Fix the FRED RSP0 MSR out of sync with
 its per CPU cache

On 1/8/2025 3:04 PM, Dave Hansen wrote:
> On 1/7/25 18:36, Xin Li (Intel) wrote:
>> The FRED RSP0 MSR (pointing to the top of the kernel stack for user
>> level event delivery) and its per CPU cache should be kept in sync to
>> avoid redundant writes in the exit to user space path. As a result,
>> a write to the FRED RSP0 MSR is paired with a write to its per CPU
>> cache as fred_update_rsp0() does.
> 
> I _think_ you're trying to explain the general use of a per-cpu MSR
> cache. That's good. But I was reading this paragraph and thinking at
> this point that the bug had something to do with redundant writes to the
> MSR.
> 
> How about this?
> 
> 	The FRED RSP0 MSR is only used for delivering events when
> 	running userspace. The kernel leverages this property to reduce
> 	expensive MSR writes and optimize context switches.  The kernel
> 	only writes the MSR when about to run userspace *and* when the
> 	MSR has actually changed since the last time userspace ran.
> 
> 	This optimization is implemented by maintaining a per-cpu cache
> 	of FRED RSP0 and then checking that against the value for the
> 	current task's stack before running userspace.
> 
> 	However, cpu_init_fred_exceptions() writes the MSR without
> 	updating the per-cpu cache. This means that the kernel might
> 	return to userspace with MSR_IA32_FRED_RSP0==0 when it needed to
> 	point to the current task stack. This would induce a double
> 	fault (#DF), which is bad.
> 
> 	A context switch after cpu_init_fred_exceptions() can paper over
> 	the issue since it updates the cached value. That evidently
> 	happens most of the time, which explains how this bug got through.
> 

This is a way better explanation, thanks a lot!
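
For anyone reading along, the failure mode described above can be modeled
in a few lines of user-space C. This is only a sketch, not the kernel code:
the MSR and the per-CPU cache are plain variables for a single simulated
CPU, every sim_* name is hypothetical, and the assumed ordering (the cache
already holds the task's stack top when the init path runs) is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated FRED RSP0 MSR and its software cache for one CPU. */
static uint64_t sim_msr_fred_rsp0;
static uint64_t sim_cached_rsp0;

/* Stand-in for the expensive MSR write. */
static void sim_wrmsr_rsp0(uint64_t val)
{
	sim_msr_fred_rsp0 = val;
}

/*
 * Exit-to-user path: write the MSR only when the cached value differs
 * from the current task's stack top, and keep the cache in sync.
 */
static void sim_update_rsp0(uint64_t task_stack_top)
{
	assert(task_stack_top != 0);	/* a task stack top is never 0 */

	if (sim_cached_rsp0 != task_stack_top) {
		sim_wrmsr_rsp0(task_stack_top);
		sim_cached_rsp0 = task_stack_top;
	}
}

/* Buggy init: resets the MSR but leaves the cache stale. */
static void sim_init_buggy(void)
{
	sim_wrmsr_rsp0(0);
}

/* Fixed init: every MSR write is paired with a cache update. */
static void sim_init_fixed(void)
{
	sim_wrmsr_rsp0(0);
	sim_cached_rsp0 = 0;
}

/*
 * Run one scenario and return the MSR value that would be live when
 * the task returns to userspace.
 */
static uint64_t sim_run(int use_fixed_init, uint64_t task_stack_top)
{
	sim_msr_fred_rsp0 = 0;
	sim_cached_rsp0 = 0;

	/* Assumed starting point: MSR and cache both hold the stack top. */
	sim_update_rsp0(task_stack_top);

	if (use_fixed_init)
		sim_init_fixed();
	else
		sim_init_buggy();

	/* No context switch in between; go straight back to userspace. */
	sim_update_rsp0(task_stack_top);

	return sim_msr_fred_rsp0;
}
```

With the buggy init, sim_run(0, top) returns 0 because the stale cache
makes the exit path skip the write. With the fixed init, sim_run(1, top)
returns top, since the cache was reset together with the MSR.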