Message-ID: <6f5f917e-cded-66ca-2549-f3d51dff1595@amd.com>
Date: Wed, 14 Feb 2018 09:09:05 -0600
From: Tom Lendacky <thomas.lendacky@....com>
To: "Kirill A. Shutemov" <kirill@...temov.name>,
Kai Huang <kai.huang@...ux.intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Dave Hansen <dave.hansen@...el.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME
On 2/14/2018 3:02 AM, Kirill A. Shutemov wrote:
> On Wed, Feb 14, 2018 at 08:30:20PM +1300, Kai Huang wrote:
>> On Tue, 2018-02-13 at 22:57 -0600, Tom Lendacky wrote:
>>> On 2/13/2018 10:21 PM, Kirill A. Shutemov wrote:
>>>> On Tue, Feb 13, 2018 at 10:10:22PM -0600, Tom Lendacky wrote:
>>>>> On 2/8/2018 6:55 AM, Kirill A. Shutemov wrote:
>>>>>> AMD SME claims one bit from physical address to indicate
>>>>>> whether the
>>>>>> page is encrypted or not. To achieve that we clear out the bit
>>>>>> from
>>>>>> __PHYSICAL_MASK.
>>>>>
>>>>> I was actually working on a suggestion by Linus to use one of the
>>>>> software
>>>>> page table bits to indicate encryption and translate that to the
>>>>> hardware
>>>>> bit when writing the actual page table entry. With that,
>>>>> __PHYSICAL_MASK
>>>>> would go back to its original definition.
>>>>
>>>> But you would need to mask it on reading of pfn from page table
>>>> entry,
>>>> right? I expect it to have more overhead than this one.
>>>
>>> When reading back an entry it would translate the hardware bit
>>> position
>>> back to the software bit position. The suggestion for changing it
>>> was
>>> to make _PAGE_ENC a constant and not tied to the sme_me_mask.
>
> But is it really constant? I thought it's enumerated at boot-time.
> Can we step onto a problem for future AMD CPUs?
_PAGE_ENC would be constant and it would be translated to the actual bit
that was enumerated at boot-time when writing the page table entry and
translated back to _PAGE_ENC when reading the page table entry.
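
A minimal sketch of that round trip, under stated assumptions: the names
SW_PAGE_ENC, hw_enc_mask, pte_sw_to_hw() and pte_hw_to_sw() are hypothetical
and are not the kernel's definitions, bit 57 is assumed as the software bit
only because it is the position discussed in this thread, and hw_enc_mask
stands in for sme_me_mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
#define SW_PAGE_ENC (1ULL << 57)  /* constant software bit (assumed position) */

/* Stand-in for sme_me_mask: hardware bit found at boot, 0 if SME is inactive. */
static uint64_t hw_enc_mask;

/* Software view -> value that would be written to the page table entry. */
static uint64_t pte_sw_to_hw(uint64_t sw)
{
	uint64_t hw = sw & ~SW_PAGE_ENC;

	/* The software value must not already occupy the hardware bit. */
	assert((sw & hw_enc_mask) == 0);

	if (sw & SW_PAGE_ENC)
		hw |= hw_enc_mask;
	return hw;
}

/* Value read from the page table entry -> software view. */
static uint64_t pte_hw_to_sw(uint64_t hw)
{
	uint64_t sw = hw & ~hw_enc_mask;

	if (hw & hw_enc_mask)
		sw |= SW_PAGE_ENC;
	return sw;
}
```

With hw_enc_mask set to a single enumerated bit, pte_hw_to_sw(pte_sw_to_hw(x))
gives back x for any x that does not already use that bit, which is the
property the constant _PAGE_ENC approach relies on.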
>
> In case of MKTME the bits we need to clear are not constant. Depends on
> CPU and BIOS settings.
>
> By making _PAGE_ENC constant we would effectively lower maximum physical
> address space the kernel can handle, regardless if the system has SME
> enabled. I can imagine some people wouldn't be happy about this.
I don't see how this would lower the maximum physical address space the
kernel could handle. Bit 57 is part of the reserved page table flag
bits and if SME is not enabled the hardware bits are never used.

What I do see as a problem is a kernel built with support for SME, so
that _PAGE_ENC is not zero, where SME has not been enabled by the BIOS
or mem_encrypt=off is specified. In this case you can never be certain
that the translation from software bit to hardware bit and back is
correct. Take, for example, pmd_bad(). Here, _KERNPG_TABLE would have a
non-zero _PAGE_ENC or'd into it. When written to a page table entry when
SME is not enabled/active, the actual hardware encryption bit would not be
set. When reading back the value, since the hardware encryption bit is
not set, the translation to set the _PAGE_ENC bit won't be done and the
comparison to _KERNPG_TABLE would fail. Of course we could just eliminate
_PAGE_ENC from the comparison...
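
A rough sketch of that failure, again with hypothetical names
(SW_PAGE_ENC, SW_KERNPG_TABLE, PTE_ADDR_MASK, hw_enc_mask and the pmd_*
helpers are made up for illustration and only loosely modeled on the real
pmd_bad()). The flag values 0x63 (present, rw, accessed, dirty) and 0x4
(user) follow the usual x86 bit layout; bit 57 as the software bit is an
assumption taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
#define SW_PAGE_ENC     (1ULL << 57)            /* constant software bit */
#define PAGE_USER       0x4ULL
#define SW_KERNPG_TABLE (0x63ULL | SW_PAGE_ENC) /* P | RW | A | D | ENC */
#define PTE_ADDR_MASK   0x000ffffffffff000ULL   /* bits 12..51 */

/* Stand-in for sme_me_mask: 0 means SME is not enabled/active. */
static uint64_t hw_enc_mask;

static uint64_t pmd_sw_to_hw(uint64_t sw)
{
	uint64_t hw = sw & ~SW_PAGE_ENC;

	/* The software value must not already occupy the hardware bit. */
	assert((sw & hw_enc_mask) == 0);

	if (sw & SW_PAGE_ENC)
		hw |= hw_enc_mask;      /* no-op when the mask is 0 */
	return hw;
}

static uint64_t pmd_hw_to_sw(uint64_t hw)
{
	uint64_t sw = hw & ~hw_enc_mask;

	if (hw & hw_enc_mask)           /* never true when the mask is 0 */
		sw |= SW_PAGE_ENC;
	return sw;
}

/* Compares the translated flags against a constant that includes ENC. */
static bool pmd_bad_sketch(uint64_t hw_entry)
{
	uint64_t flags = pmd_hw_to_sw(hw_entry) & ~PTE_ADDR_MASK & ~PAGE_USER;

	return flags != SW_KERNPG_TABLE;
}

/* Variant that drops the ENC bit from both sides of the comparison. */
static bool pmd_bad_ignoring_enc(uint64_t hw_entry)
{
	uint64_t flags = pmd_hw_to_sw(hw_entry) & ~PTE_ADDR_MASK & ~PAGE_USER;

	return (flags & ~SW_PAGE_ENC) != (SW_KERNPG_TABLE & ~SW_PAGE_ENC);
}
```

With hw_enc_mask left at 0, an entry built from SW_KERNPG_TABLE loses the
encryption bit on the way in and never regains it on the way out, so
pmd_bad_sketch() reports a perfectly good entry as bad, while the variant
that ignores the bit does not.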
>
> And I think it would collide with 5-level paging.
Does 5-level paging remove bit 57 from the reserved flags?
>
> I would leave it as variable for now and look on this later once we would
> have infrastructure to patch constants in kernel text.
If the MK-TME support is going to use the same approach and fold the
mask/bits into _PAGE_ENC at runtime, then maybe it would be best to get
that in first and then look to see if something could be done along the
lines of what Linus suggests or with the patchable constants.

Thanks,
Tom
>