Message-ID: <9502454a-8065-4a65-9644-2b7fe0ec5f7f@intel.com>
Date: Thu, 2 Oct 2025 09:14:20 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Brendan Jackman <jackmanb@...gle.com>, Andy Lutomirski <luto@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>, Zi Yan <ziy@...dia.com>,
Axel Rasmussen <axelrasmussen@...gle.com>, Yuanchu Xie <yuanchu@...gle.com>,
Roman Gushchin <roman.gushchin@...ux.dev>
Cc: peterz@...radead.org, bp@...en8.de, dave.hansen@...ux.intel.com,
mingo@...hat.com, tglx@...utronix.de, akpm@...ux-foundation.org,
david@...hat.com, derkling@...gle.com, junaids@...gle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, reijiw@...gle.com,
rientjes@...gle.com, rppt@...nel.org, vbabka@...e.cz, x86@...nel.org,
yosry.ahmed@...ux.dev
Subject: Re: [PATCH 04/21] x86/mm/asi: set up asi_nonsensitive_pgd
On 10/2/25 07:05, Brendan Jackman wrote:
> On Wed Oct 1, 2025 at 8:28 PM UTC, Dave Hansen wrote:
...
>> I also can't help but wonder if it would have been easier and more
>> straightforward to just start this whole exercise at 4k: force all the
>> ASI tables to be 4k. Then, later, add the 2MB support and tie to
>> pageblocks on after.
>
> This would lead to a much smaller patchset, but I think it creates some
> pretty yucky technical debt and complexity of its own. If you're
> imagining a world where we just leave most of the allocator as-is, and
> just inject "map into ASI" or "unmap from ASI" at the right moments...
...
I'm trying to separate out the two problems:
1. Have a set of page tables that never require allocations in order to
map or unmap sensitive data.
2. Manage each pageblock as either all sensitive or all not sensitive.
There is a nonzero set of dependencies to make sure that the pageblock
size is compatible with the page table mapping size... unless you just
make the mapping size 4k.
If the mapping size is 4k, the pageblock size can be anything. There's
no dependency to satisfy.
So I'm not saying to make the sensitive/nonsensitive boundary 4k. Just
to make the _mapping_ size 4k. Then, come back later, and move the
mapping size over to 2MB as an optimization.
>>> + if (asi_nonsensitive_pgd) {
>>> + /*
>>> + * Since most memory is expected to end up sensitive, start with
>>> + * everything unmapped in this pagetable.
>>> + */
>>> + pgprot_t prot_np = __pgprot(pgprot_val(prot) & ~_PAGE_PRESENT);
>>> +
>>> + VM_BUG_ON((PAGE_SHIFT + pageblock_order) < page_level_shift(PG_LEVEL_2M));
>>> + phys_pgd_init(asi_nonsensitive_pgd, paddr_start, paddr_end, 1 << PG_LEVEL_2M,
>>> + prot_np, init, NULL);
>>> + }
>>
>> I'm also kinda wondering what the purpose is of having a whole page
>> table full of !_PAGE_PRESENT entries. It would be nice to know how this
>> eventually gets turned into something useful.
>
> If you are thinking of the fact that just clearing P doesn't really do
> anything for Meltdown/L1TF.. yeah that's true! We'll actually need to
> munge the PFN or something too, but here I wanted to just focus on the
> broad strokes of integration without worrying too much about individual
> CPU mitigations. Flipping _PAGE_PRESENT is already supported by
> set_memory.c and IIRC it's good enough for everything newer than
> Skylake.
>
> Other than that, these pages being unmapped is the whole point.. later
> on, the subset of memory that we don't need to protect will get flipped
> to being present. Everything else will trigger a pagefault if touched
> and we'll switch address spaces, do the flushing etc.
>
> Sorry if I'm missing your point here...
What is the point of having a pgd if you can't put it in CR3? If you:
write_cr3(asi_nonsensitive_pgd);
you'll just triple fault because all kernel text is !_PAGE_PRESENT.
The critical point is when 'asi_nonsensitive_pgd' is functional enough
that it can be loaded into CR3 and handle a switch to the normal
init_mm->pgd.