Message-ID: <d4be0149-1a28-24e8-7821-e8c96f98a7ac@oracle.com>
Date: Tue, 17 Nov 2020 09:42:41 +0100
From: Alexandre Chartre <alexandre.chartre@...cle.com>
To: Andy Lutomirski <luto@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Tom Lendacky <thomas.lendacky@....com>,
Joerg Roedel <jroedel@...e.de>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
jan.setjeeilers@...cle.com, Junaid Shahid <junaids@...gle.com>,
oweisse@...gle.com, Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Alexander Graf <graf@...zon.de>, mgross@...ux.intel.com,
kuzuno@...il.com
Subject: Re: [RFC][PATCH v2 11/21] x86/pti: Extend PTI user mappings
On 11/17/20 12:06 AM, Andy Lutomirski wrote:
> On Mon, Nov 16, 2020 at 12:18 PM Alexandre Chartre
> <alexandre.chartre@...cle.com> wrote:
>>
>>
>> On 11/16/20 8:48 PM, Andy Lutomirski wrote:
>>> On Mon, Nov 16, 2020 at 6:49 AM Alexandre Chartre
>>> <alexandre.chartre@...cle.com> wrote:
>>>>
>>>> Extend PTI user mappings so that more kernel entry code can be executed
>>>> with the user page-table. To do so, we need to map syscall and interrupt
>>>> entry code, per-cpu offsets (__per_cpu_offset, which is used in some
>>>> entry code), the stack canary, and the PTI stack (which is defined per
>>>> task).
>>>
>>> Does anything unmap the PTI stack? Mapping is easy, and unmapping
>>> could be a pretty big mess.
>>>
>>
>> No, there's no unmap. The mapping exists as long as the task page-table
>> does (i.e. as long as the task mm exists). I assume that the task stack
>> and mm are freed at the same time but that's not something I have checked.
>>
>
> Nope. A multi-threaded mm will free task stacks when the task exits,
> but the mm may outlive the individual tasks. Additionally, if you
> allocate page tables as part of mapping PTI stacks, you need to make
> sure the pagetables are freed.
So I think I just need to unmap the PTI stack from the user page-table
when the task exits. Everything else is handled because the kernel and
PTI stack are allocated in a single chunk (referenced by task->stack).
> Finally, you need to make sure that
> the PTI stacks have appropriate guard pages -- just doubling the
> allocation is not safe enough.
The PTI stack does have guard pages: only part of the task stack is mapped
into the user page-table, so the pages around the PTI stack are not mapped
there. The page below is the task stack guard, and the page above belongs to
the kernel-only stack, so it is never mapped into the user page-table.
+ *     +-------------+
+ *     |             | ^                       ^
+ *     | kernel-only | | KERNEL_STACK_SIZE     |
+ *     |    stack    | |                       |
+ *     |             | V                       |
+ *     +-------------+ <- top of kernel stack  | THREAD_SIZE
+ *     |             | ^                       |
+ *     | kernel and  | | KERNEL_STACK_SIZE     |
+ *     |  PTI stack  | |                       |
+ *     |             | V                       v
+ *     +-------------+ <- top of stack
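To make the layout above concrete, here is a minimal userspace sketch of the
address arithmetic, not code from the patch. It assumes, reading the diagram,
that the allocation referenced by task->stack is THREAD_SIZE bytes, that
THREAD_SIZE is twice KERNEL_STACK_SIZE, and that the half ending at "top of
stack" is the part mapped into the user page-table. The page size, the value
of KERNEL_STACK_SIZE, and the helper names are all made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real constants come from the patch series. */
#define SKETCH_PAGE_SIZE   4096UL
#define KERNEL_STACK_SIZE  (4 * SKETCH_PAGE_SIZE)
#define THREAD_SIZE        (2 * KERNEL_STACK_SIZE)

/* Half-open address range [start, end). */
struct range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Assumption: the kernel-only stack is the lower-addressed half of the
 * allocation, i.e. the upper box in the diagram.
 */
static inline struct range kernel_only_range(uintptr_t stack_base)
{
	struct range r = { stack_base, stack_base + KERNEL_STACK_SIZE };
	return r;
}

/*
 * Assumption: the kernel and PTI stack is the higher-addressed half,
 * ending at "top of stack" (stack_base + THREAD_SIZE).
 */
static inline struct range pti_stack_range(uintptr_t stack_base)
{
	struct range r = { stack_base + KERNEL_STACK_SIZE,
			   stack_base + THREAD_SIZE };
	return r;
}

/* True if addr falls in the part that would be mapped for user mode. */
static inline bool mapped_in_user_pgtable(uintptr_t stack_base, uintptr_t addr)
{
	struct range r = pti_stack_range(stack_base);

	return addr >= r.start && addr < r.end;
}
```

With these assumptions, the last byte of the kernel-only half
(stack_base + KERNEL_STACK_SIZE - 1) is reported as not mapped, and so is the
first byte past the top of stack (stack_base + THREAD_SIZE). Those are the two
neighbours that act as guards for the PTI stack.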
> My intuition is that this is going to be far more complexity than is justified.
Sounds like only the PTI stack unmap is missing, which is hopefully not
that bad. I will check that.
alex.