Message-ID: <b0fb40f6-8baa-498f-bb88-50113cb80f53@intel.com>
Date: Wed, 1 Oct 2025 13:28:14 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Brendan Jackman <jackmanb@...gle.com>, Andy Lutomirski <luto@...nel.org>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>,
 Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
 Johannes Weiner <hannes@...xchg.org>, Zi Yan <ziy@...dia.com>,
 Axel Rasmussen <axelrasmussen@...gle.com>, Yuanchu Xie <yuanchu@...gle.com>,
 Roman Gushchin <roman.gushchin@...ux.dev>
Cc: peterz@...radead.org, bp@...en8.de, dave.hansen@...ux.intel.com,
 mingo@...hat.com, tglx@...utronix.de, akpm@...ux-foundation.org,
 david@...hat.com, derkling@...gle.com, junaids@...gle.com,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, reijiw@...gle.com,
 rientjes@...gle.com, rppt@...nel.org, vbabka@...e.cz, x86@...nel.org,
 yosry.ahmed@...ux.dev
Subject: Re: [PATCH 04/21] x86/mm/asi: set up asi_nonsensitive_pgd

On 9/24/25 07:59, Brendan Jackman wrote:
> Create the initial shared pagetable to hold all the mappings that will
> be shared among ASI domains.
> 
> Mirror the physmap into the ASI pagetables, but with a maximum
> granularity that's guaranteed to allow changing pageblock sensitivity
> without having to allocate pagetables, and with everything as
> non-present.

Could you also talk about what this granularity _actually_ is and why it
has the property of never requiring page table allocations?

...
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index e98e85cf15f42db669696ba8195d8fc633351b26..7e0471d46767c63ceade479ae0d1bf738f14904a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -7,6 +7,7 @@
>   *  Copyright (C) 2002,2003 Andi Kleen <ak@...e.de>
>   */
>  
> +#include <linux/asi.h>
>  #include <linux/signal.h>
>  #include <linux/sched.h>
>  #include <linux/kernel.h>
> @@ -746,7 +747,8 @@ phys_pgd_init(pgd_t *pgd_page, unsigned long paddr_start, unsigned long paddr_en
>  {
>  	unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
>  
> -	*pgd_changed = false;
> +	if (pgd_changed)
> +		*pgd_changed = false;

This 'pgd_changed' hunk isn't mentioned in the changelog.

...
> @@ -797,6 +800,24 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
>  
>  	paddr_last = phys_pgd_init(init_mm.pgd, paddr_start, paddr_end, page_size_mask,
>  				   prot, init, &pgd_changed);
> +
> +	/*
> +	 * Set up ASI's unrestricted physmap. This needs to mapped at minimum 2M
> +	 * size so that regions can be mapped and unmapped at pageblock
> +	 * granularity without requiring allocations.
> +	 */

This took me a minute to wrap my head around.

Here, I think you're trying to convey that:

  1. There's a higher-level design decision that all sensitivity will be
     done at a 2M granularity. A 2MB physical region is either sensitive
     or not.
  2. Because of #1, 1GB mappings are not cool because splitting a 1GB
     mapping into 2MB needs to allocate a page table page.
  3. 4k mappings are OK because they can also have their permissions
     changed at a 2MB granularity. It's just more laborious.

The "minimum 2M size" comment really threw me off because that, to me,
also includes 1G which is a no-no here.
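
To make points 1-3 concrete, here is a toy userspace model in C. It is
not kernel code, and every name in it (map_level, table_pages_to_allocate(),
entries_to_update()) is made up for this sketch. It only assumes the usual
x86-64 layout of 512 entries per page table page, and counts what it costs
to change the sensitivity of one 2MB block depending on how the surrounding
region is currently mapped:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not the kernel's PG_LEVEL_*. */
enum map_level { MAP_4K, MAP_2M, MAP_1G };

/* x86-64: a 4 KiB page table page holds 512 eight-byte entries. */
#define ENTRIES_PER_TABLE 512

/*
 * Page table pages that must be allocated before the sensitivity of a
 * single 2MB block can be changed, given how its region is mapped now.
 */
size_t table_pages_to_allocate(enum map_level level)
{
	switch (level) {
	case MAP_1G:
		/* One PUD entry covers 512 blocks; splitting needs a new PMD table. */
		return 1;
	case MAP_2M:
	case MAP_4K:
		/* The PMD table (2M) or PTE table (4k) covering the block exists. */
		return 0;
	}
	assert(0 && "unknown mapping level");
	return 0;
}

/* Entries that must be written to change that one 2MB block. */
size_t entries_to_update(enum map_level level)
{
	switch (level) {
	case MAP_1G:
		/* Every entry of the freshly allocated PMD table must be filled. */
		return ENTRIES_PER_TABLE;
	case MAP_2M:
		/* A single PMD entry maps exactly the block. */
		return 1;
	case MAP_4K:
		/* All 512 PTEs under the block's PMD entry: laborious, no alloc. */
		return ENTRIES_PER_TABLE;
	}
	assert(0 && "unknown mapping level");
	return 0;
}
```

In this model only MAP_1G returns a nonzero allocation count, which is
why "minimum 2M" reads wrong: the constraint is really "2M or smaller".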

I also can't help but wonder if it would have been easier and more
straightforward to just start this whole exercise at 4k: force all the
ASI tables to be 4k. Then, later, add the 2MB support and tie it to
pageblocks afterward.


> +	if (asi_nonsensitive_pgd) {
> +		/*
> +		 * Since most memory is expected to end up sensitive, start with
> +		 * everything unmapped in this pagetable.
> +		 */
> +		pgprot_t prot_np = __pgprot(pgprot_val(prot) & ~_PAGE_PRESENT);
> +
> +		VM_BUG_ON((PAGE_SHIFT + pageblock_order) < page_level_shift(PG_LEVEL_2M));
> +		phys_pgd_init(asi_nonsensitive_pgd, paddr_start, paddr_end, 1 << PG_LEVEL_2M,
> +			      prot_np, init, NULL);
> +	}

I'm also kinda wondering what the purpose is of having a whole page
table full of !_PAGE_PRESENT entries. It would be nice to know how this
eventually gets turned into something useful.


