linux-kernel - Re: [PATCH] x86: Use entire page for the per-cpu GDT only if paravirt-enabled

Message-ID: <20150929090112.GA1400@gmail.com>
Date:	Tue, 29 Sep 2015 11:01:12 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Denys Vlasenko <dvlasenk@...hat.com>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Joerg Roedel <joro@...tes.org>, Gleb Natapov <gleb@...nel.org>,
	Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
	x86@...nel.org, linux-kernel@...r.kernel.org,
	Andy Lutomirski <luto@...nel.org>,
	Brian Gerst <brgerst@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Borislav Petkov <bp@...en8.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH] x86: Use entire page for the per-cpu GDT only if
 paravirt-enabled


* Denys Vlasenko <dvlasenk@...hat.com> wrote:

> On 09/28/2015 09:58 AM, Ingo Molnar wrote:
> > 
> > * Denys Vlasenko <dvlasenk@...hat.com> wrote:
> > 
> >> On 09/26/2015 09:50 PM, H. Peter Anvin wrote:
> >>> NAK.  We really should map the GDT read-only on all 64 bit systems,
> >>> since we can't hide the address from SGDT.  Same with the IDT.
> >>
> >> Sorry, I don't understand your point.
> > 
> > So the problem is that right now the SGDT instruction (which is unprivileged) 
> > leaks the real address of the kernel image:
> > 
> >  fomalhaut:~> ./sgdt 
> >  SGDT: ffff88303fd89000 / 007f
> > 
> > that 'ffff88303fd89000' is a kernel address.
> 
> Thank you.
> I do know that SGDT and friends are unprivileged on x86
> and thus they allow userspace (and guest kernels in paravirt)
> learn things they don't need to know.
> 
> I don't see how making GDT page-aligned and page-sized
> changes anything in this regard. SGDT will still work,
> and still leak GDT address.

Well, as I tried to explain in the other part of my mail, doing so enables us to 
remap the GDT to a less security-sensitive virtual address that does not leak the 
kernel's randomized address:

> > Your observation in the changelog and your patch:
> > 
> >>>> It is page-sized because of paravirt. [...]
> > 
> > ... conflicts with the intention to mark (remap) the primary GDT address read-only 
> > on native kernels as well.
> > 
> > So what we should do instead is to use the page alignment properly and remap the 
> > GDT to a read-only location, and load that one.
> 
> If we'd have a small GDT (i.e. what my patch does), we still can remap the 
> entire page which contains small GDT, and simply don't care that some other data 
> is also visible through that RO page.

That's generally considered fragile: suppose an attacker has a limited information 
leak that can read absolute addresses with system privilege, but does not know 
the kernel's randomized base offset. With a 'partial page' mapping there could be 
function pointers near the GDT, on the same page the GDT happens to be on, that 
leak this information.

(The same goes for crypto keys or other critical information, such as canary 
values or salts, accidentally ending up nearby.)

Arguably it's a bit tenuous, but when playing remapping games it's generally 
considered good to be page aligned and page sized, with zero padding.
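
To illustrate the 'page aligned and page sized, with zero padding' point, here is a
minimal user-space sketch, not the kernel's actual definition. The names
sketch_gdt_page, SKETCH_PAGE_SIZE and SKETCH_GDT_ENTRIES are made up for this
example; 16 entries is inferred from the 0x007f limit in the SGDT output
(16 * 8 bytes - 1), and a 4096-byte page is assumed. The aligned attribute is a
GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   4096u  /* assumed page size */
#define SKETCH_GDT_ENTRIES 16u    /* inferred from limit 0x007f: 16 * 8 - 1 */

/* Hypothetical layout: descriptors first, explicit padding up to one page. */
struct sketch_gdt_page {
	uint64_t entries[SKETCH_GDT_ENTRIES];
	uint8_t  pad[SKETCH_PAGE_SIZE - SKETCH_GDT_ENTRIES * sizeof(uint64_t)];
} __attribute__((aligned(SKETCH_PAGE_SIZE)));

_Static_assert(sizeof(struct sketch_gdt_page) == SKETCH_PAGE_SIZE,
	       "the GDT container must fill exactly one page");

/* Static storage is zero-initialized, so the padding starts out as zeros. */
static struct sketch_gdt_page sketch_gdt;

/* Returns 1 if only zero bytes follow the descriptors, 0 otherwise. */
static int sketch_padding_is_zero(const struct sketch_gdt_page *p)
{
	for (size_t i = 0; i < sizeof(p->pad); i++)
		if (p->pad[i] != 0)
			return 0;
	return 1;
}
```

With a layout like this, a read-only alias of the page would expose only the
descriptors and zeros, so no unrelated data can end up next to the GDT.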

> > This would have a couple of advantages:
> > 
> >  - This would give kernel address randomization more teeth on x86.
> > 
> >  - An additional advantage would be that rootkits overwriting the GDT would have 
> >    a bit more work to do.
> > 
> >  - A third advantage would be that for NUMA systems we could 'mirror' the GDT into
> >    node-local memory and load those. This makes GDT load cache-misses a bit less
> >    expensive.
> 
> GDT is per-cpu. Isn't per-cpu memory already NUMA-local?

Indeed it is:

fomalhaut:~> for ((cpu=1; cpu<9; cpu++)); do taskset $cpu ./sgdt ; done
SGDT: ffff88103fa09000 / 007f
SGDT: ffff88103fa29000 / 007f
SGDT: ffff88103fa29000 / 007f
SGDT: ffff88103fa49000 / 007f
SGDT: ffff88103fa49000 / 007f
SGDT: ffff88103fa49000 / 007f
SGDT: ffff88103fa29000 / 007f
SGDT: ffff88103fa69000 / 007f

I confused it with the IDT, which is still global.
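
For reference, a tool like ./sgdt can be very small. The following is a sketch of
what such a program might look like, not the source of the tool used above;
sketch_gdtr, sketch_read_gdtr and sketch_print_gdtr are names invented here. It
assumes GCC or Clang inline assembly on x86-64 and returns -1 on other
architectures. On newer CPUs with UMIP enabled, the kernel may intercept the
instruction and return a dummy value or deliver a fault, so the output there may
not look like the above.

```c
#include <stdint.h>
#include <stdio.h>

/* Memory operand written by SGDT in 64-bit mode: 2-byte limit, 8-byte base. */
struct sketch_gdtr {
	uint16_t limit;
	uint64_t base;
} __attribute__((packed));

/* Returns 0 and fills *out on x86-64, -1 on any other architecture. */
static int sketch_read_gdtr(struct sketch_gdtr *out)
{
#if defined(__x86_64__)
	/* SGDT is unprivileged: it stores the GDTR contents to memory. */
	__asm__ volatile("sgdt %0" : "=m"(*out));
	return 0;
#else
	(void)out;
	return -1;
#endif
}

/* Prints one line in the same "SGDT: base / limit" format as above. */
static void sketch_print_gdtr(void)
{
	struct sketch_gdtr g = { 0, 0 };

	if (sketch_read_gdtr(&g) == 0)
		printf("SGDT: %016llx / %04x\n",
		       (unsigned long long)g.base, (unsigned)g.limit);
	else
		puts("SGDT: not available on this architecture");
}
```

Calling sketch_print_gdtr() from main() and running the binary under taskset
should give one line per invocation, as in the loop above.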

This also means that the GDT does not in itself leak kernel addresses at the 
moment, except that it leaks the layout of the percpu area.
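
As a quick check of the 'layout of the percpu area' remark, the four distinct
bases printed above can be compared directly. This is only arithmetic on the
sample output; sketch_common_stride and sketch_all_aligned are made-up helpers.
(The values repeat in the output presumably because taskset without -c takes an
affinity mask rather than a CPU number, so several iterations allowed more than
one CPU.)

```c
#include <stddef.h>
#include <stdint.h>

/* Distinct GDT bases from the sample output above, in ascending order. */
static const uint64_t sketch_bases[] = {
	0xffff88103fa09000ULL, 0xffff88103fa29000ULL,
	0xffff88103fa49000ULL, 0xffff88103fa69000ULL,
};

/*
 * Returns the distance between consecutive bases if it is the same for
 * every pair, or 0 if the spacing is uneven or n < 2.
 */
static uint64_t sketch_common_stride(const uint64_t *bases, size_t n)
{
	uint64_t stride;

	if (n < 2)
		return 0;
	stride = bases[1] - bases[0];
	for (size_t i = 2; i < n; i++)
		if (bases[i] - bases[i - 1] != stride)
			return 0;
	return stride;
}

/* Returns 1 if every base is a multiple of align, 0 otherwise. */
static int sketch_all_aligned(const uint64_t *bases, size_t n, uint64_t align)
{
	for (size_t i = 0; i < n; i++)
		if (bases[i] % align != 0)
			return 0;
	return 1;
}
```

For these samples the stride is 0x20000 (128 KiB) and every base is 4 KiB
aligned, which is the kind of layout information the addresses give away.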

So my suggestion would be to:

 - make the GDT unconditionally page aligned and page sized, then remap it to a
   read-only address, also unconditionally, as we already do for the IDT.

 - make the IDT per CPU as well, for performance reasons.
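
The remapping idea in the first item can be tried out in user space. The sketch
below is only an analogy, not kernel code: it maps one page of a temporary file
twice, once read-write and once read-only, so that the read-only address is the
one that would be handed out (the analogue of the address loaded into GDTR) while
updates go through the writable alias. All names (sketch_alias,
sketch_alias_create, sketch_alias_destroy) are hypothetical, and it assumes a
POSIX system where tmpfile() and mmap() work.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct sketch_alias {
	uint64_t       *rw;   /* writable view, kept private */
	const uint64_t *ro;   /* read-only view of the same page */
	size_t          len;  /* length of each mapping in bytes */
};

/* Maps one zero-filled page twice. Returns 0 on success, -1 on failure. */
static int sketch_alias_create(struct sketch_alias *a)
{
	long page = sysconf(_SC_PAGESIZE);
	FILE *f = tmpfile();
	void *rw, *ro;

	if (f == NULL || page <= 0) {
		if (f != NULL)
			fclose(f);
		return -1;
	}
	/* Extending the empty file yields a page of zeros. */
	if (ftruncate(fileno(f), (off_t)page) != 0) {
		fclose(f);
		return -1;
	}
	rw = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
		  fileno(f), 0);
	ro = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fileno(f), 0);
	fclose(f);	/* the mappings stay valid after the file is closed */

	if (rw == MAP_FAILED || ro == MAP_FAILED) {
		if (rw != MAP_FAILED)
			munmap(rw, (size_t)page);
		if (ro != MAP_FAILED)
			munmap(ro, (size_t)page);
		return -1;
	}
	a->rw = rw;
	a->ro = ro;
	a->len = (size_t)page;
	return 0;
}

/* Removes both views created by sketch_alias_create(). */
static void sketch_alias_destroy(struct sketch_alias *a)
{
	munmap(a->rw, a->len);
	munmap((void *)a->ro, a->len);
}
```

A store through the rw view shows up at the ro address immediately, while a
store through the ro address (after casting away const) would fault. That is the
property a read-only GDT mapping would rely on.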

Thanks,

	Ingo