Message-ID: <50CF4ACD.80701@linux.intel.com>
Date: Mon, 17 Dec 2012 08:39:41 -0800
From: "H. Peter Anvin" <hpa@...ux.intel.com>
To: Jan Beulich <JBeulich@...e.com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Arnd Bergmann <arnd@...db.de>, Ingo Molnar <mingo@...e.hu>,
Michael Kerrisk <mtk.manpages@...il.com>,
Guennadi Liakhovetski <g.liakhovetski@....de>,
Matt Fleming <matt.fleming@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Dave Jones <davej@...hat.com>,
David Howells <dhowells@...hat.com>,
Grant Likely <grant.likely@...retlab.ca>,
Markus Trippelsdorf <markus@...ppelsdorf.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] x86/uapi for 3.8
On 12/17/2012 08:00 AM, Jan Beulich wrote:
>>>> On 17.12.12 at 16:44, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>> On Mon, Dec 17, 2012 at 1:04 AM, Jan Beulich <JBeulich@...e.com> wrote:
>>>
>>> How about this being caused by using the same lower level
>>> page table entries that swapper_pg_dir uses, namely including
>>> the _PAGE_GLOBAL bits? efi_call_virt_{pre,epi}log() only write
>>> CR3 (see 185034e72d591f9465e5e18f937ed642e7ea0070), but
>>> would need to also flip CR4.PGE afaict.
>>
>> Now *this* is the kind of issue that I could easily see causing major
>> corruption, but be subtle enough to not happen reliably. Coming back
>> from the EFI calls (or going into them) with stale TLB contents due to
>> global pages could explain things.
>>
>> Good thinking. That efi call code should use flush_tlb_kernel() (or
>> __flush_tlb_global() if it wants to avoid any paravirtualization
>> stuff) if it has global pages in different places from the normal
>> kernel map. Does it really have that?
>
> I don't see it having such. But I also don't think flush_tlb_kernel()
> is the right mechanism here. I'd rather suggest clearing CR4.PGE in
> the "prelog", and restore it in the epilog. Para-virtual environments
> shouldn't be directly interfacing with EFI runtime code anyway.
>
Right, I think you nailed this one. This patch copies PTEs from the
kernel page tables, so they will have the global bit set. It obviously
makes no sense to *copy* PTEs from the kernel and yet leave the global
bit set, which means there are two ways of fixing it: one is to share
the page tables and use the cr4.pge off/on trick that Jan mentioned --
this would also be my preference -- and the other is to copy the PTEs
but strip the global bit, which has the advantage that the actual kernel
mappings will survive in the TLB.

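
The two options can be illustrated with a small user-space toy model. This is
only a sketch under stated assumptions, not kernel code: the `toy_*` names,
the 8-slot TLB, and `copy_ptes_nonglobal()` are hypothetical. The bit
positions are the architectural ones (G is bit 8 of a PTE, PGE is bit 7 of
CR4). The model encodes the hardware rule at issue: a CR3 write drops only
non-global TLB entries, while a change to CR4.PGE drops everything.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_GLOBAL    (UINT64_C(1) << 8)  /* G bit of an x86 PTE */
#define CR4_PGE       (UINT64_C(1) << 7)  /* CR4.PGE, global-page enable */
#define TOY_TLB_SLOTS 8                   /* hypothetical toy TLB size */

/* Hypothetical model of one CPU: CR4 plus a tiny TLB of cached PTEs. */
struct toy_cpu {
    uint64_t cr4;
    bool     valid[TOY_TLB_SLOTS];
    uint64_t pte[TOY_TLB_SLOTS];
};

/* Cache a translation in the given slot. */
static void toy_tlb_fill(struct toy_cpu *cpu, size_t slot, uint64_t pte)
{
    cpu->valid[slot] = true;
    cpu->pte[slot] = pte;
}

/* A CR3 write flushes every entry except global ones (when PGE is on). */
static void toy_write_cr3(struct toy_cpu *cpu)
{
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++) {
        bool global = (cpu->cr4 & CR4_PGE) && (cpu->pte[i] & PTE_GLOBAL);
        if (!global)
            cpu->valid[i] = false;
    }
}

/* Changing CR4.PGE flushes the whole TLB, global entries included. */
static void toy_write_cr4(struct toy_cpu *cpu, uint64_t val)
{
    if ((cpu->cr4 ^ val) & CR4_PGE)
        for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
            cpu->valid[i] = false;
    cpu->cr4 = val;
}

/* Number of translations still cached. */
static size_t toy_tlb_valid(const struct toy_cpu *cpu)
{
    size_t n = 0;
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
        n += cpu->valid[i];
    return n;
}

/* Option 2: copy PTEs but strip the global bit from each copy. */
static void copy_ptes_nonglobal(uint64_t *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] & ~PTE_GLOBAL;
}
```

In this model, a stale global entry survives `toy_write_cr3()` alone, which
is the failure mode under discussion. Clearing and restoring PGE around the
call (option 1) flushes it, at the cost of also dropping the kernel's own
global entries. Copies made with `copy_ptes_nonglobal()` (option 2) are never
cached as global in the first place.
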
One idea here is to change ioremap() on x86-64 so that, instead of
allocating address space dynamically, it always uses the PAGE_OFFSET
mapping address, even for I/O devices. Then the trampoline page table
can simply include two sets of pointers into the kernel page tables --
with, again, the caveat that a global page flush is absolutely mandatory.

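
One possible reading of "two sets of pointers" is sketched below as plain
user-space C. It is an illustration under assumptions, not the proposed
patch: `build_trampoline_pgd()`, `pgd_index_of()`, and `TOY_PAGE_OFFSET` are
hypothetical names, and the direct-map base 0xffff880000000000 is assumed
from the x86-64 layout of that era. The top-level table gets the kernel-half
entries as they are, plus low (identity) slots that alias the same
lower-level tables the PAGE_OFFSET mapping uses.

```c
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PGD    512
#define PGDIR_SHIFT     39
/* Assumed base of the kernel direct mapping (hypothetical constant). */
#define TOY_PAGE_OFFSET UINT64_C(0xffff880000000000)

/* Index of the top-level entry that covers a virtual address. */
static size_t pgd_index_of(uint64_t vaddr)
{
    return (size_t)((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Fill a trampoline top-level table from the kernel's:
 *   set 1: the kernel half (slots 256..511), shared unchanged;
 *   set 2: the first nslots low slots, aliasing the entries that back
 *          the PAGE_OFFSET mapping, so physical address P is reachable
 *          both at P and at TOY_PAGE_OFFSET + P.
 */
static void build_trampoline_pgd(uint64_t *tramp, const uint64_t *kernel,
                                 size_t nslots)
{
    size_t base = pgd_index_of(TOY_PAGE_OFFSET);

    for (size_t i = 0; i < PTRS_PER_PGD; i++)
        tramp[i] = 0;

    for (size_t i = PTRS_PER_PGD / 2; i < PTRS_PER_PGD; i++)
        tramp[i] = kernel[i];

    /* Never let the identity alias spill into the kernel half. */
    if (nslots > PTRS_PER_PGD / 2)
        nslots = PTRS_PER_PGD / 2;
    for (size_t i = 0; i < nslots && base + i < PTRS_PER_PGD; i++)
        tramp[i] = kernel[base + i];
}
```

Because both sets point at the kernel's own lower-level tables, any PTE
reached through them still carries the global bit, so the mandatory global
flush noted above applies to this layout as well.
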
Linus, Ingo, do you have any preferences here?
-hpa