Message-ID: <c2da6e0b-aad1-4151-a8d3-07afb149b1c5@rbox.co>
Date: Sun, 10 Dec 2023 18:08:43 +0100
From: Michal Luczaj <mhal@...x.co>
To: Borislav Petkov <bp@...en8.de>
Cc: x86@...nel.org, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, shuah@...nel.org, luto@...nel.org,
torvalds@...uxfoundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] x86: UMIP emulation leaking kernel addresses
On 12/9/23 16:53, Borislav Petkov wrote:
> On Wed, Dec 06, 2023 at 01:43:43AM +0100, Michal Luczaj wrote:
>> Introducing a DPL check in insn_get_seg_base(), or even in get_desc(),
>> seems enough to prevent the decoder from disclosing data.
>>
>> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
>> index 558a605929db..4c1eea736519 100644
>> --- a/arch/x86/lib/insn-eval.c
>> +++ b/arch/x86/lib/insn-eval.c
>> @@ -725,6 +725,18 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
>> if (!get_desc(&desc, sel))
>> return -1L;
>>
>> + /*
>> + * Some segment selectors coming from @regs do not necessarily reflect
>> + * the state of CPU; see get_segment_selector(). Their values might
>> + * have been altered by ptrace. Thus, the instruction decoder can be
>> + * tricked into "dereferencing" a segment descriptor that would
>> + * otherwise cause a CPU exception -- for example due to mismatched
>> + * privilege levels. This opens up the possibility to expose kernel
>> + * space base address of DPL=0 segments.
>> + */
>> + if (desc.dpl < (regs->cs & SEGMENT_RPL_MASK))
>> + return -1L;
>> +
>> return get_desc_base(&desc);
>> }
>>
>> That said, I guess instead of trying to harden the decoder,
>
> Well, here's what my CPU manual says:
>
> "4.10.1 Accessing Data Segments
>
> ...
>
> The processor compares the effective privilege level with the DPL in the
> descriptor-table entry referenced by the segment selector. If the
> effective privilege level is greater than or equal to (numerically
> lower-than or equal-to) the DPL, then the processor loads the segment
> register with the data-segment selector.
>
> If the effective privilege level is lower than (numerically
> greater-than) the DPL, a general-protection exception (#GP) occurs and
> the segment register is not loaded.
>
> ...
>
> 4.10.2 Accessing Stack Segments
>
> The processor compares the CPL with the DPL in the descriptor-table
> entry referenced by the segment selector. The two values must be equal.
> If they are not equal, a #GP occurs and the SS register is not loaded."
>
> So *actually* doing those checks in the insn decoder is the proper thing
> to do, IMNSVHO.
Are you suggesting checking only CPL vs. DPL, or making sure the insn
decoder faithfully emulates all the other checks the CPU does? Like the
desc.s issue described below.
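
To make the question concrete, here is a rough standalone model of just the two rules quoted above. It is only a sketch: the helper names are made up for illustration, nothing here is existing decoder code, and it deliberately ignores everything else the CPU checks when loading a segment register (descriptor type, present bit, conforming code segments, and so on).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers, not kernel API. Privilege levels are 0..3 and
 * a numerically larger value means less privilege.
 */
static unsigned int effective_pl(unsigned int cpl, unsigned int rpl)
{
	/* The effective privilege level is the weaker of CPL and RPL. */
	return cpl > rpl ? cpl : rpl;
}

/* Data segment rule: load allowed when max(CPL, RPL) <= DPL numerically. */
static bool data_seg_access_ok(unsigned int cpl, unsigned int rpl,
			       unsigned int dpl)
{
	assert(cpl <= 3 && rpl <= 3 && dpl <= 3);
	return effective_pl(cpl, rpl) <= dpl;
}

/* Stack segment rule as quoted: CPL and DPL must be equal. */
static bool stack_seg_access_ok(unsigned int cpl, unsigned int dpl)
{
	assert(cpl <= 3 && dpl <= 3);
	return cpl == dpl;
}
```

With this model, a user-mode (CPL=3) reference to a DPL=0 data segment is rejected, which is the case the DPL check in the first patch is aimed at.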
>> Now, I'm far from being competent, but here's an idea I've tried: tell
>> the #GP handler that UMIP-related exceptions come only as #GP(0):
>>
>> if (static_cpu_has(X86_FEATURE_UMIP)) {
>> - if (user_mode(regs) && fixup_umip_exception(regs))
>> + if (user_mode(regs) && !error_code && fixup_umip_exception(regs))
>> goto exit;
>> }
>
> And yap, as you've realized, that alone doesn't fix the leaking.
With this fix applied, I can't see any way to sufficiently confuse the
UMIP emulation with a non-ESPFIX bad IRET. It appears that #GP(selector)
takes precedence over #GP(0), so tripping IRET with any malformed selector
always ends up with #GP handler's error_code != 0, even if conditions were
met for #GP(0) just as well. Is there something I'm missing?

That said, there's still the case of the #DF handler feeding the #GP
handler after a fault in ESPFIX. Consider

	cs = (GDT_ENTRY_TSS << 3) | USER_RPL
	ss = (SOME_LDT_ENTRY << 3) | SEGMENT_LDT | USER_RPL
	ip = "sgdt %cs:(%reg)"

Attempting IRET with such an illegal CS raises #GP(selector), but (because
of ESPFIX) this #GP is promoted to #DF, where it becomes #GP(0). And UMIP
emulation is triggered.
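
For anyone not fluent in selector layout, the values above are composed as in the sketch below. The constants are my assumption of what a 64-bit kernel's asm/segment.h defines (GDT_ENTRY_TSS differs on 32-bit), the LDT index 0 stands in for SOME_LDT_ENTRY, and make_selector() and sel_index() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for a 64-bit kernel; verify against your tree. */
#define GDT_ENTRY_TSS		8
#define USER_RPL		0x3
#define SEGMENT_LDT		0x4	/* TI bit: selector refers to the LDT */
#define SEGMENT_RPL_MASK	0x3

/* Hypothetical helper: selector = 13-bit index | TI bit | 2-bit RPL. */
static uint16_t make_selector(unsigned int index, unsigned int ti,
			      unsigned int rpl)
{
	assert(index < 8192);
	assert((ti & ~SEGMENT_LDT) == 0);
	assert((rpl & ~SEGMENT_RPL_MASK) == 0);
	return (uint16_t)((index << 3) | ti | rpl);
}

/* Hypothetical helper: recover the descriptor table index. */
static unsigned int sel_index(uint16_t sel)
{
	return sel >> 3;
}
```

So the CS in this scenario is a GDT selector with RPL=3 whose index points at the TSS descriptor, and the SS is an ordinary-looking LDT selector.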

The UMIP emulator starts by fetching code from user space. The insn decoder
does `copy_from_user(buf, (void __user *)ip, MAX_INSN_SIZE)` where `ip` is
CS.base+IP and CS.base here is actually (a part of) GDT_ENTRY_TSS.base, a
kernel address. With IP under the user's control, there is no fault while
copying.
Next, insn_get_code_seg_params() concludes that, given a TSS as the code
segment, address and operand size are both 16-bit. Prefix SGDT with size
overrides, and we're back to 32-bit. Then insn_get_addr_ref() and
copy_to_user() do the leaking.
I don't know if/how to deal with ESPFIX losing #GP's error code. As for
telling insn decoder that system segments aren't code:
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -809,6 +809,10 @@ int insn_get_code_seg_params(struct pt_regs *regs)
if (!get_desc(&desc, sel))
return -EINVAL;
+ /* System segments are not code. */
+ if (!desc.s)
+ return -EINVAL;
+
/*
* The most significant byte of the Type field of the segment descriptor
* determines whether a segment contains data or code. If this is a data
Is this something in the right direction?
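
To illustrate what the desc.s check changes, here is a reduced standalone model of the decision as I understand it. The struct, the function and the sample descriptors are hypothetical and much simpler than the real insn_get_code_seg_params(), which returns packed operand/address sizes. The model returns only the default address size in bytes.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced view of a segment descriptor. */
struct desc_model {
	unsigned int type : 4;
	unsigned int s    : 1;	/* 0 = system segment, 1 = code/data */
	unsigned int l    : 1;	/* 64-bit code segment */
	unsigned int d    : 1;	/* default operation size */
};

/* Sample descriptors: a busy 64-bit TSS and a 64-bit code segment. */
static const struct desc_model tss_like = { .type = 0xB, .s = 0, .l = 0, .d = 0 };
static const struct desc_model code64   = { .type = 0xB, .s = 1, .l = 1, .d = 0 };

/* Returns the default address size in bytes (2, 4 or 8), or -EINVAL. */
static int code_seg_addr_size(const struct desc_model *desc, bool check_s)
{
	/* Proposed check: system segments are not code. */
	if (check_s && !desc->s)
		return -EINVAL;

	/* Bit 3 of the type field clear: data segment, not code. */
	if (!(desc->type & 0x8))
		return -EINVAL;

	switch ((desc->l << 1) | desc->d) {
	case 0:
		return 2;	/* legacy 16-bit */
	case 1:
		return 4;	/* 32-bit */
	case 2:
		return 8;	/* 64-bit */
	default:
		return -EINVAL;	/* L=1, D=1 is invalid */
	}
}
```

Without the check, the TSS-like descriptor passes as 16-bit code because bit 3 of its type happens to be set. With the check, it is rejected before the type field is looked at.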
(Note, get_segment_selector() is broken for selectors with the high bit
set. I'll send a patch later.)
thanks,
Michal