Message-ID: <3516813d-5d2a-821a-81e8-1ed78ad63561@intel.com>
Date:   Thu, 11 Aug 2022 20:28:37 +0800
From:   Aaron Lu <aaron.lu@...el.com>
To:     Hyeonggon Yoo <42.hyeyoo@...il.com>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
        "song@...nel.org" <song@...nel.org>
Subject: Re: [RFC PATCH 1/4] x86/mm/cpa: restore global bit when page is
 present

On 8/11/2022 7:30 PM, Hyeonggon Yoo wrote:
> On Thu, Aug 11, 2022 at 08:16:08AM +0000, Lu, Aaron wrote:
>> On Thu, 2022-08-11 at 05:21 +0000, Hyeonggon Yoo wrote:
>>> On Mon, Aug 08, 2022 at 10:56:46PM +0800, Aaron Lu wrote:
>>>> For configs that don't have PTI enabled, or CPUs that don't need the
>>>> Meltdown mitigation, the current kernel can lose the GLOBAL bit after
>>>> a page goes through a cycle of present -> not present -> present.
>>>>
>>>> It happens like this (__vunmap() does this in vm_remove_mappings()):
>>>> original page protection: 0x8000000000000163 (NX/G/D/A/RW/P)
>>>> set_memory_np(page, 1):   0x8000000000000062 (NX/D/A/RW)   G and P lost
>>>> set_memory_p(page, 1):    0x8000000000000063 (NX/D/A/RW/P) P restored
>>>>
>>>> In the end, this page's protection no longer has the Global bit set
>>>> and this creates a problem for this series' merging of small mappings.
>>>>
>>>> For this reason, restore the Global bit when the page is present on
>>>> systems that do not have PTI enabled.
>>>>
>>>> (pgprot_clear_protnone_bits() deserves a better name if this patch is
>>>> acceptable, but first I would like feedback on whether this is the
>>>> right way to solve the problem, so I haven't bothered with the name yet.)
>>>>
>>>> Signed-off-by: Aaron Lu <aaron.lu@...el.com>
>>>> ---
>>>>  arch/x86/mm/pat/set_memory.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
>>>> index 1abd5438f126..33657a54670a 100644
>>>> --- a/arch/x86/mm/pat/set_memory.c
>>>> +++ b/arch/x86/mm/pat/set_memory.c
>>>> @@ -758,6 +758,8 @@ static pgprot_t pgprot_clear_protnone_bits(pgprot_t prot)
>>>>  	 */
>>>>  	if (!(pgprot_val(prot) & _PAGE_PRESENT))
>>>>  		pgprot_val(prot) &= ~_PAGE_GLOBAL;
>>>> +	else
>>>> +		pgprot_val(prot) |= _PAGE_GLOBAL & __default_kernel_pte_mask;
>>>>  
>>>>  	return prot;
>>>>  }
>>>
>>> IIUC this makes it impossible to set _PAGE_GLOBAL when PTI is on.
>>>
>>
>> Yes. Is this a problem?
>> I think that is the intended behaviour when PTI is on: not to set the
>> Global bit on kernel mappings.
> 
> Please note that I'm not an expert on PTI.
> 
> But AFAIK with PTI, at least the kernel portion that is mapped into the
> user page table is mapped as global when PGE is supported.
> 
> So I'm not sure that "the Global bit is never used for kernel mappings
> when PTI is enabled" is true.
>
> Also, commit d1440b23c922d ("x86/mm: Factor out pageattr _PAGE_GLOBAL
> setting"), which introduced pgprot_clear_protnone_bits(), says:
> 	
> 	This unconditional setting of _PAGE_GLOBAL is a problem when we have
> 	PTI and non-PTI and we want some areas to have _PAGE_GLOBAL and some
> 	not.
> 
> 	This updated version of the code says:
> 	1. Clear _PAGE_GLOBAL when !_PAGE_PRESENT
> 	2. Never set _PAGE_GLOBAL implicitly
> 	3. Allow _PAGE_GLOBAL to be in cpa.set_mask
> 	4. Allow _PAGE_GLOBAL to be inherited from previous PTE
>

Thanks for this info, I'll need to take a closer look at PTI.
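
If I read the boot code correctly, the reason _PAGE_GLOBAL &
__default_kernel_pte_mask is a no-op under PTI is that PTI drops
_PAGE_GLOBAL from the default mask at boot. Roughly, paraphrased from
probe_page_size_mask() in arch/x86/mm/init.c (from memory, so please
double check the tree):

	/* Paraphrased sketch, not a verbatim copy of the kernel code. */
	if (boot_cpu_has(X86_FEATURE_PGE)) {
		cr4_set_bits_and_update_boot(X86_CR4_PGE);
		__supported_pte_mask |= _PAGE_GLOBAL;
	} else
		__supported_pte_mask &= ~_PAGE_GLOBAL;

	/* By default everything supported is allowed: */
	__default_kernel_pte_mask = __supported_pte_mask;
	/* Except with PTI, where the kernel is mostly non-Global: */
	if (cpu_feature_enabled(X86_FEATURE_PTI))
		__default_kernel_pte_mask &= ~_PAGE_GLOBAL;

So on !PTI systems the patch restores G for present pages, and on PTI
systems it changes nothing.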

>>> Maybe it would be less intrusive to make
>>> set_direct_map_default_noflush() replace the protection bits
>>> with PAGE_KERNEL, as it's only called for the direct map and the
>>> function's purpose is to reset permissions to the default:
>>>
>>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
>>> index 1abd5438f126..0dd4433c1382 100644
>>> --- a/arch/x86/mm/pat/set_memory.c
>>> +++ b/arch/x86/mm/pat/set_memory.c
>>> @@ -2250,7 +2250,16 @@ int set_direct_map_invalid_noflush(struct page *page)
>>>
>>>  int set_direct_map_default_noflush(struct page *page)
>>>  {
>>> -       return __set_pages_p(page, 1);
>>> +       unsigned long tempaddr = (unsigned long) page_address(page);
>>> +       struct cpa_data cpa = {
>>> +                       .vaddr = &tempaddr,
>>> +                       .pgd = NULL,
>>> +                       .numpages = 1,
>>> +                       .mask_set = PAGE_KERNEL,
>>> +                       .mask_clr = __pgprot(~0),
> 
> Nah, this sets _PAGE_ENC unconditionally, which would need to be
> evaluated. Maybe a less intrusive way would be:
> 		       .mask_set = __pgprot(_PAGE_PRESENT |
> 					   (_PAGE_GLOBAL & __default_kernel_pte_mask)),
>                        .mask_clr = __pgprot(0),
> 
>>> +                       .flags = 0};
>>> +
>>> +       return __change_page_attr_set_clr(&cpa, 0);
>>>  }
>>
>> Looks reasonable to me and it is indeed less intrusive. I'm only
>> concerned there might be other paths that also go through present ->
>> not present -> present, which this change would not cover.
>>
> 
> AFAIK the only other path that goes through present -> not present ->
> present (using CPA) is DEBUG_PAGEALLOC.
> 
> Do we care about direct map fragmentation when using DEBUG_PAGEALLOC?
> 

No, the direct map does not use large page mappings when DEBUG_PAGEALLOC
is enabled, so there is nothing to split in the first place.
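
That is decided at boot in probe_page_size_mask() too; roughly this
check (again paraphrased from memory, please double check):

	/*
	 * Paraphrased sketch: with pagealloc debugging, the identity map
	 * is built from 4K pages only, so cpa() never has to split large
	 * pages.
	 */
	if (boot_cpu_has(X86_FEATURE_PSE) && !debug_pagealloc_enabled())
		page_size_mask |= 1 << PG_LEVEL_2M;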

>>>
>>> set_direct_map_{invalid,default}_noflush() is the exact reason why
>>> the direct map becomes split after vmalloc/vfree with special
>>> permissions.
>>
>> Yes, I agree, because the mapping can lose the G bit after the whole
>> cycle when PTI is not on. When PTI is on, there is no such problem
>> because the G bit is not there initially.
>>
>> Thanks,
>> Aaron
> 
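
Putting your two suggestions together, the version I would try looks
something like this (untested sketch, with the masks from your
follow-up, i.e. only setting P plus G where the default mask allows it):

	/* Untested sketch combining the proposal above with the corrected masks. */
	int set_direct_map_default_noflush(struct page *page)
	{
		unsigned long tempaddr = (unsigned long)page_address(page);
		struct cpa_data cpa = {
			.vaddr = &tempaddr,
			.pgd = NULL,
			.numpages = 1,
			/* set P; set G only if PTI left it in the default mask */
			.mask_set = __pgprot(_PAGE_PRESENT |
					     (_PAGE_GLOBAL & __default_kernel_pte_mask)),
			.mask_clr = __pgprot(0),
			.flags = 0,
		};

		return __change_page_attr_set_clr(&cpa, 0);
	}

This avoids touching pgprot_clear_protnone_bits() while still restoring
G on !PTI systems for this path.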
