linux-kernel - Re: [PATCH] x86: setup: extend low identity map to cover whole kernel range

Message-ID: <CALCETrU=YL8yWpp29xO0N7TEVogX1j5Fyk5M_FpJTa9ZOS21Zw@mail.gmail.com>
Date:	Wed, 14 Oct 2015 14:39:58 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Matt Fleming <matt@...eblueprint.co.uk>
Cc:	Paolo Bonzini <pbonzini@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
	stable <stable@...r.kernel.org>,
	Laszlo Ersek <lersek@...hat.com>,
	Matt Fleming <matt.fleming@...el.com>,
	Borislav Petkov <bp@...e.de>,
	"linux-efi@...r.kernel.org" <linux-efi@...r.kernel.org>
Subject: Re: [PATCH] x86: setup: extend low identity map to cover whole kernel range

On Wed, Oct 14, 2015 at 2:00 PM, Matt Fleming <matt@...eblueprint.co.uk> wrote:
> On Wed, 14 Oct, at 09:22:03AM, Andy Lutomirski wrote:
>> On Wed, Oct 14, 2015 at 6:52 AM, Matt Fleming <matt@...eblueprint.co.uk> wrote:
>> > (Pulling in luto for low-level x86 fu)
>> >
>> > On Wed, 14 Oct, at 01:30:45PM, Paolo Bonzini wrote:
>> >> On 32-bit systems, the initial_page_table is reused by
>> >> efi_call_phys_prolog as an identity map to call
>> >> SetVirtualAddressMap.  efi_call_phys_prolog takes care of
>> >> converting the current CPU's GDT to a physical address too.
>> >>
>> >> For PAE kernels the identity mapping is achieved by aliasing the
>> >> first PDPE for the kernel memory mapping into the first PDPE
>> >> of initial_page_table.  This makes the EFI stub's trick "just work".
>> >>
>> >> However, for non-PAE kernels there is no guarantee that the identity
>> >> mapping in the initial_page_table extends as far as the GDT; in this
>> >> case, accesses to the GDT will cause a page fault (which quickly becomes
>> >> a triple fault).  Fix this by copying the kernel mappings from
>> >> swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
>> >> identity mapping.
>> >
>> > Oops, good catch guys. This is clearly a bug, but...
>> >
>> >> For some reason, this is only reproducible with QEMU's dynamic translation
>> >> mode, and not for example with KVM.  However, even under KVM one can clearly
>> >> see that the page table is bogus:
>>
>> I haven't looked at the code, but it wouldn't surprise me if this is
>> some kind of TLB issue.  With the hardware TLB (which is in use on
>> KVM), it seems quite likely that the GDT is pretty much always in the
>> TLB and, if nothing flushes global mappings, then it'll probably stick
>> around.
>
> From some quick experiments it appears that you can skate past this
> issue if you don't receive any interrupts while the bogus GDT pointer
> is loaded, or if you avoid reloading the segment registers in general.
> Which is interesting because I assumed that writing to GDTR took
> immediate effect.

Trivia for your amusement:

AFAICT it's entirely permissible for the GDTR and/or LDT descriptor to
point to unmapped memory.  Any attempt to use them (segment loads,
interrupts, IRET, etc.) will try to access that memory as if the access
came from CPL 0 and, if the access fails, will generate a valid page
fault with CR2 pointing into the GDT or LDT.
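
To make that concrete, here is a toy model in C. It is a sketch for
illustration only, not kernel code and not how any real CPU is
implemented; every name in it (toy_cpu, toy_lgdt, toy_load_segment) is
made up, and it assumes a tiny 64 KiB linear address space, an 8-byte
aligned GDT base, and selectors that refer to the GDT rather than the
LDT. The point it tries to show is that loading GDTR only records a
base and limit, while the descriptor read (and therefore the possible
page fault) happens on a later segment load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not kernel code: all names here are hypothetical. */
#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   16   /* tiny 64 KiB linear address space */

struct toy_cpu {
    uint32_t gdtr_base;             /* linear base held in GDTR */
    uint16_t gdtr_limit;            /* last valid byte offset in the table */
    bool     mapped[TOY_NR_PAGES];  /* stand-in for the page tables */
    uint32_t cr2;                   /* faulting linear address */
};

enum toy_result { TOY_OK, TOY_GP, TOY_PF };

/* Loading GDTR only records base and limit; nothing is dereferenced. */
static void toy_lgdt(struct toy_cpu *cpu, uint32_t base, uint16_t limit)
{
    cpu->gdtr_base = base;
    cpu->gdtr_limit = limit;
}

/* A segment load has to read the 8-byte descriptor, and that read can fault. */
static enum toy_result toy_load_segment(struct toy_cpu *cpu, uint16_t selector)
{
    uint32_t offset = selector & ~7u;   /* strip the RPL and TI bits */
    uint32_t addr, page;

    /* The whole descriptor must lie within the table limit. */
    if (offset + 7 > cpu->gdtr_limit)
        return TOY_GP;

    addr = cpu->gdtr_base + offset;
    page = addr >> TOY_PAGE_SHIFT;
    assert(page < TOY_NR_PAGES);        /* limitation of the toy model */

    if (!cpu->mapped[page]) {
        cpu->cr2 = addr;                /* CR2 points into the GDT */
        return TOY_PF;
    }
    return TOY_OK;
}
```

With only page 0 mapped, toy_lgdt(&cpu, 0x3000, 0x7f) changes nothing
observable, and the fault shows up only when toy_load_segment(&cpu, 0x10)
is called, with cr2 set to 0x3010.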

Xen is nuts^Wclever and actually uses this.

Of course, if your #PF vector references a GDT or LDT descriptor and
trying to load that descriptor results in a page fault, you get a
double fault.
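
The escalation can be sketched as a small decision function. Again this
is only a toy model under stated assumptions: the names (sim_state,
sim_deliver, sim_page_fault) are made up, the IDT itself is assumed to
be mapped, and the only thing that can go wrong during delivery is that
the descriptor referenced by the gate sits in unmapped memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: all names here are hypothetical. */
enum sim_outcome {
    SIM_HANDLED_PF,     /* the #PF handler ran */
    SIM_DOUBLE_FAULT,   /* the #DF handler ran */
    SIM_TRIPLE_FAULT    /* nothing could be delivered; the CPU shuts down */
};

struct sim_state {
    bool pf_gate_desc_mapped;  /* descriptor named by the #PF gate is readable */
    bool df_gate_desc_mapped;  /* descriptor named by the #DF gate is readable */
};

/* Delivery through a gate succeeds only if its descriptor can be read. */
static bool sim_deliver(bool desc_mapped)
{
    return desc_mapped;
}

static enum sim_outcome sim_page_fault(const struct sim_state *s)
{
    assert(s);

    if (sim_deliver(s->pf_gate_desc_mapped))
        return SIM_HANDLED_PF;

    /* A page fault while delivering a page fault escalates to #DF. */
    if (sim_deliver(s->df_gate_desc_mapped))
        return SIM_DOUBLE_FAULT;

    /* A fault while delivering #DF has nowhere left to escalate to. */
    return SIM_TRIPLE_FAULT;
}
```

If the descriptors for both gates are unreachable, the model ends in
SIM_TRIPLE_FAULT, which is consistent with the triple fault described
in the original patch description.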

I learned this while trying to puzzle out why v1 of my LDT
synchronization patch caused random faults on Xen.

--Andy