linux-kernel - Re: [PATCH 0/2] x86/fred: Fix two problems during the FRED initialization

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f8ab27ff-d6cf-41c7-a033-9e578b1e51fb@zytor.com>
Date: Tue, 25 Jun 2024 02:09:29 -0700
From: Xin Li <xin@...or.com>
To: Hou Wenlong <houwenlong.hwl@...group.com>
Cc: linux-kernel@...r.kernel.org, Lai Jiangshan <jiangshan.ljs@...group.com>,
        Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>, Xin Li
 <xin3.li@...el.com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 0/2] x86/fred: Fix two problems during the FRED
 initialization

On 6/23/2024 11:21 PM, Hou Wenlong wrote:
> On Sat, Jun 22, 2024 at 08:31:26AM +0800, Xin Li wrote:
>> On 6/21/2024 6:12 AM, Hou Wenlong wrote:
>>> One issue is that FRED can be disabled in trap_init(), but
>>> sysvec_install() can be called before trap_init(), thus the system
>>> interrupt handler is not installed into the IDT if FRED is disabled
>>> later. Initially, I attempted to parse the cmdline and decide whether to
>>> enable or disable FRED after parse_early_param(). However, I ultimately
>>> chose to always install the system handler into the IDT in
>>> sysvec_install(), which is simple and should be sufficient.
>>
>> Which module with a system vector gets initialized before trap_init() is
>> called?
>>
> Sorry, I didn't mention the case here. I see sysvec_install() is used
> only in the guest part (KVM, HYPERV) currently. For example, the KVM
> guest will register the HYPERVISOR_CALLBACK_VECTOR APF handler in
> kvm_guest_init(), which is called before trap_init(). So if only the FRED
> handler is registered and FRED is disabled in trap_init() later, then the
> IDT handler of the APF handler is missing.

This is a bug!  Your fix looks good to me.

>>> Another problem is that the page fault handler (exc_page_fault()) is
>>> installed into the IDT before FRED is enabled. Consequently, if a #PF is
>>> triggered in this gap, the handler would receive the wrong CR2 from the
>>> stack if FRED feature is present. To address this, I added a page fault
>>> entry stub for FRED similar to the debug entry. However, I'm uncertain
>>> whether this is enough reason to add a new entry. Perhaps a static key
>>> may suffice to indicate whether FRED setup is completed and the handler
>>> can use it.
>>
>> How could a #PF get triggered during that gap?
>>
>> Initialization time funnies are really unpleasant.
>>
> I'm not sure if there will be a #PF during that gap; I just received the
> wrong fault address when I made a mistake in that gap and a #PF
> occurred. Before idt_setup_early_pf(), the registered page fault handler
> is do_early_exception(), which uses native_read_cr2(). However, after
> that, the page fault handler had been changed to exc_page_fault(), so if
> something bad happened and an unexpected #PF occurred, the fault address
> in the error output will be wrong, although the CR2 in __show_regs() is
> correct. I'm not sure if this matters or not since the kernel will panic
> at that time.

So this doesn't sound a real problem, right?

We could simply do:

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e6c469b323cc..e500777ed2b4 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1495,7 +1495,7 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
            irqentry_state_t state;
            unsigned long address;

-       address = cpu_feature_enabled(X86_FEATURE_FRED) ? 
fred_event_data(regs) : read_cr2();
+       address = native_read_cr4() & X86_CR4_FRED ? 
fred_event_data(regs) : read_cr2();

            prefetchw(&current->mm->mmap_lock);


But the page fault handler is an extreme hot path, it's not worth it.

Thanks!
        Xin