Message-ID: <20191115021119.GB18745@guptapadev.amr>
Date:   Thu, 14 Nov 2019 18:11:19 -0800
From:   Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
To:     Dave Hansen <dave.hansen@...el.com>
Cc:     Nadav Amit <nadav.amit@...il.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Jan Kiszka <jan.kiszka@...mens.com>,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Ralf Ramsauer <ralf.ramsauer@...-regensburg.de>,
        "Gupta, Pawan Kumar" <pawan.kumar.gupta@...el.com>,
        kirill.shutemov@...ux.intel.com
Subject: Re: [FYI PATCH 0/7] Mitigation for CVE-2018-12207

On Wed, Nov 13, 2019 at 09:26:24PM -0800, Dave Hansen wrote:
> On 11/13/19 5:17 PM, Nadav Amit wrote:
> > But is it always the case? Looking at __split_large_page(), it seems that the
> > TLB invalidation is only done after the PMD is changed. Can't this leave a
> > small time window in which a malicious actor triggers a machine-check on 
> > another core than the one that runs __split_large_page()?
> 
> It's not just a split.  It has to be a change that results in
> inconsistencies between two entries in the TLB.  A normal split doesn't
> change the resulting final translations and is never inconsistent
> between the two translations.
> 
> To have an inconsistency, you need to change the backing physical
> address (or cache attributes?).  I'd need to go double-check the erratum
> to be sure about the cache attributes.
> 
> In any case, that's why we decided that normal kernel mapping
> split/merges don't need to be mitigated.  But, we should probably
> document this somewhere if it's not clear.
> 
> Pawan, did we document the results of the audit you did anywhere?

Kirill Shutemov did the heavy lifting; thank you, Kirill. These were the
major areas probed:

1. Can a non-privileged user application induce this erratum?

	Userspace can trigger switching between 4k and 2M (in both
	directions), but the kernel already follows the protocol that
	avoids this issue, because of similar errata in AMD CPUs. [1][2]

2. Can the kernel accidentally induce this?

	__split_large_page() in arch/x86/mm/pageattr.c was the suspect [3]. 

	The locking scheme described in the comment only guarantees that
	TLB entries for 4k and 2M/1G will have the same page attributes
	until the TLB flush. Nothing prevents multiple TLB entries of
	different sizes with the same attributes from coexisting.

	But the erratum can be triggered only when:

		Software modifies the paging structures so that the same
		linear address is translated using a large page (2 MB, 4
		MB, or 1 GB) with a different physical address or memory
		type.

	And in this case the physical address and memory type are
	preserved until the TLB is flushed, so it should be safe.
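
To make the condition above concrete, here is a small userspace sketch of
the check we applied by hand. It is not kernel code and does not touch
real page tables: struct xlate, split_is_consistent() and the memtype
values are made up for illustration, and only the 2M/4k case is modelled.
It just asks whether every 4k translation produced by a split maps the
same linear address to the same physical address with the same memory
type as the original large page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_SIZE 4096ULL                    /* 4k page */
#define LARGE_SIZE (2ULL * 1024 * 1024)       /* 2M page */
#define NR_SMALL   (LARGE_SIZE / SMALL_SIZE)  /* 512 entries per 2M page */

/* Hypothetical model of one translation; not a kernel type. */
struct xlate {
	uint64_t phys;     /* physical base address of the page */
	int      memtype;  /* illustrative only, e.g. 0 = WB, 1 = UC */
};

/*
 * Return true if replacing one 2M translation with NR_SMALL 4k
 * translations leaves every linear address with the same physical
 * address and the same memory type. If this holds, a stale 2M TLB
 * entry and a fresh 4k TLB entry agree with each other, which is the
 * case the erratum text excludes.
 */
static bool split_is_consistent(const struct xlate *large,
				const struct xlate *small)
{
	assert(large != NULL && small != NULL);

	for (uint64_t i = 0; i < NR_SMALL; i++) {
		if (small[i].phys != large->phys + i * SMALL_SIZE)
			return false;  /* physical address changed */
		if (small[i].memtype != large->memtype)
			return false;  /* memory type changed */
	}
	return true;
}
```

A plain split, like the one done by __split_large_page(), passes this
check. Changing the physical address or memory type of any 4k entry
before the flush would fail it.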

Thanks,
Pawan

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/huge_memory.c#n2190
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/khugepaged.c#n1038
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/mm/pageattr.c#n1020
