Date:   Fri, 5 Jul 2019 16:47:35 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
CC:     <x86@...nel.org>, Nadav Amit <namit@...are.com>,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
        Stephane Eranian <eranian@...gle.com>,
        Feng Tang <feng.tang@...el.com>,
        Andy Lutomirski <luto@...nel.org>
Subject: Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more
 robust

On 04/07/2019 16:51, Thomas Gleixner wrote:
>   2) The loop termination logic is interesting at best.
>
>      If the machine has no TSC or cpu_khz is not known yet it tries 1
>      million times to ack stale IRR/ISR bits. What?
>
>      With TSC it uses the TSC to calculate the loop termination. It takes a
>      timestamp at entry and terminates the loop when:
>
>      	  (rdtsc() - start_timestamp) >= (cpu_khz << 10)
>
>      That's roughly one second.
>
>      Both methods are problematic. The APIC has 256 vectors, which means
>      that in theory max. 256 IRR/ISR bits can be set. In practice this is
>      impossible as the first 32 vectors are reserved and not affected and
>      the chance that more than a few bits are set is close to zero.

[Disclaimer.  I talked to Thomas in private first, and he asked me to
post this publicly as the CVE is almost a decade old already.]

I'm afraid that this isn't quite true.

In terms of IDT vectors, the first 32 are reserved for exceptions, but
only the first 16 are reserved in the LAPIC.  Vectors 16-31 are fair
game for incoming IPIs (SDM Vol3, 10.5.2 Valid Interrupt Vectors).
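
To make the overlap concrete, here is a minimal sketch of the two ranges
described above.  The function and constant names are hypothetical and
are not kernel identifiers; the boundaries (16 for the LAPIC, 32 for
exceptions) are the ones stated in the paragraph above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants; hypothetical names, not taken from the kernel. */
enum {
    LAPIC_FIRST_VALID_VECTOR = 16, /* the LAPIC only reserves vectors 0-15 */
    IDT_FIRST_NON_EXCEPTION  = 32, /* IDT vectors 0-31 are for exceptions */
};

/* True if the LAPIC will accept an interrupt with this vector. */
static bool lapic_accepts_vector(uint8_t vector)
{
    return vector >= LAPIC_FIRST_VALID_VECTOR;
}

/* True if the IDT entry for this vector is reserved for an exception. */
static bool idt_exception_vector(uint8_t vector)
{
    return vector < IDT_FIRST_NON_EXCEPTION;
}

/* True for vectors 16-31: deliverable as an IPI, yet handled as an exception. */
static bool ipi_can_hit_exception_vector(uint8_t vector)
{
    return lapic_accepts_vector(vector) && idt_exception_vector(vector);
}
```

With these definitions, vector 17 (#AC) is in the overlap, while vectors
15 and 32 are not.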

In practice, this makes Linux vulnerable to CVE-2011-1898 / XSA-3, which
I'm disappointed to see wasn't shared with other software vendors at the
time.

Because TPR is 0, an incoming IPI can trigger #AC, #CP, #VC or #SX
without an error code on the stack, which results in a corrupt pt_regs
in the exception handler, and a stack underflow on the way back out,
most likely with a fault on IRET.

These can be addressed by setting TPR to 0x10, which will inhibit
delivery of any errant IPIs in this range, but some extra sanity logic
may not go amiss.  An error code on a 64-bit stack can be spotted with
`testb $8, %spl` due to %rsp being aligned before pushing the exception
frame.
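
As a rough illustration of both points, the following sketch models the
TPR comparison and the alignment test in plain C.  The helper names are
hypothetical.  It assumes that an interrupt is inhibited when its
priority class (vector bits 7:4) is less than or equal to the TPR class,
and that the check runs at handler entry, before anything else has been
pushed: the CPU aligns %rsp to 16 bytes and then pushes five 8-byte
words (SS, RSP, RFLAGS, CS, RIP), plus a sixth if there is an error
code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: model of the TPR priority-class comparison.
 * Assumes a vector is held back when its class (bits 7:4) is at or
 * below the class programmed in TPR.
 */
static bool tpr_inhibits_vector(uint8_t tpr, uint8_t vector)
{
    return (vector >> 4) <= (tpr >> 4);
}

/*
 * Hypothetical helper: C equivalent of `testb $8, %spl` at handler entry.
 * Five pushes (40 bytes) from a 16-byte aligned %rsp leave bit 3 set;
 * six pushes (48 bytes, i.e. with an error code) leave it clear.
 */
static bool frame_has_error_code(uint64_t rsp_at_entry)
{
    return (rsp_at_entry & 8) == 0;
}
```

Under these assumptions, TPR = 0x10 holds back vectors 16-31 and lets
vector 32 and above through.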

Another interesting problem is an IPI with vector 0x80.  A cunning
attacker can use this to simulate system calls from unsuspecting
positions in userspace, or to interrupt kernel context.  At the very
least, the int 0x80 path does an unconditional swapgs, so it will try to
run with the user gs, and I expect things will explode quickly from there.

One option here is to look at ISR and complain if the bit for vector
0x80 is found to be set.
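
A minimal sketch of such a check, operating on a snapshot of the eight
32-bit ISR registers (256 bits in total) rather than on the hardware
directly.  The function name is hypothetical, and how the snapshot is
read from the APIC is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: test one vector's bit in a snapshot of the ISR.
 * Register index is vector / 32, bit index is vector % 32.
 */
static bool isr_vector_in_service(const uint32_t isr[8], uint8_t vector)
{
    return (isr[vector >> 5] >> (vector & 31)) & 1u;
}
```

For vector 0x80 this looks at bit 0 of the fifth register (index 4).  A
software `int $0x80` does not go through the LAPIC, so finding that bit
set on entry would suggest the vector arrived as an interrupt.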

Another option, which I've only just remembered, is that AMD hardware
has the Interrupt Enable Register in its extended APIC space, which may
or may not be good enough to prohibit delivery of 0x80.  There isn't
enough information in the APM to be clear, but the name suggests it is
worth experimenting with.

~Andrew
