lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c7a819fd-6644-0a83-497f-46e2044bbc00@citrix.com>
Date:   Fri, 5 Jul 2019 21:47:09 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Nadav Amit <namit@...are.com>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
        Stephane Eranian <eranian@...gle.com>,
        Feng Tang <feng.tang@...el.com>,
        Andy Lutomirski <luto@...nel.org>,
        Andrew Cooper <Andrew.Cooper3@...rix.com>
Subject: Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more
 robust

On 05/07/2019 20:19, Nadav Amit wrote:
>> On Jul 5, 2019, at 8:47 AM, Andrew Cooper <andrew.cooper3@...rix.com> wrote:
>>
>> On 04/07/2019 16:51, Thomas Gleixner wrote:
>>>  2) The loop termination logic is interesting at best.
>>>
>>>     If the machine has no TSC or cpu_khz is not known yet it tries 1
>>>     million times to ack stale IRR/ISR bits. What?
>>>
>>>     With TSC it uses the TSC to calculate the loop termination. It takes a
>>>     timestamp at entry and terminates the loop when:
>>>
>>>     	  (rdtsc() - start_timestamp) >= (cpu_hkz << 10)
>>>
>>>     That's roughly one second.
>>>
>>>     Both methods are problematic. The APIC has 256 vectors, which means
>>>     that in theory max. 256 IRR/ISR bits can be set. In practice this is
>>>     impossible as the first 32 vectors are reserved and not affected and
>>>     the chance that more than a few bits are set is close to zero.
>> [Disclaimer.  I talked to Thomas in private first, and he asked me to
>> post this publicly as the CVE is almost a decade old already.]
>>
>> I'm afraid that this isn't quite true.
>>
>> In terms of IDT vectors, the first 32 are reserved for exceptions, but
>> only the first 16 are reserved in the LAPIC.  Vectors 16-31 are fair
>> game for incoming IPIs (SDM Vol3, 10.5.2 Valid Interrupt Vectors).
>>
>> In practice, this makes Linux vulnerable to CVE-2011-1898 / XSA-3, which
>> I'm disappointed to see wasn't shared with other software vendors at the
>> time.
> IIRC (and from skimming the CVE again) the basic problem in Xen was that
> MSIs can be used when devices are assigned to generate IRQs with arbitrary
> vectors. The mitigation was to require interrupt remapping to be enabled in
> the IOMMU when IOMMU is used for DMA remapping (i.e., device assignment).
>
> Are you concerned about this case, additional concrete ones, or is it about
> security hardening? (or am I missing something?)

The phrase "impossible as the first 32 vectors are reserved" stuck out,
because its not true.  That generally means that any logic derived from
it is also false. :)

In practice, I was thinking more about robustness against buggy
conditions.  Setting TPR to 1 at start of day is very easy.  Some of the
other protections, less so.

When it comes to virtualisation, security is an illusion when a guest
kernel has a real piece of hardware in its hands.  Anyone who is under
the misapprehension otherwise should try talking to a IOMMU hardware
engineer and see the reaction on their face.  IOMMUs were designed to
isolate devices when all controlling software was of the same privilege
level.  They don't magically make the system safe against a hostile
guest device driver, which in the most basic case, can still mount a DoS
attempt with deliberately bad DMA.

~Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ