Message-ID: <b33876b6-96bd-afe0-3523-b51b8628616a@amd.com>
Date: Wed, 3 Aug 2022 16:34:53 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
x86@...nel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
"H. Peter Anvin" <hpa@...or.com>,
Michael Roth <michael.roth@....com>,
Joerg Roedel <jroedel@...e.de>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v1.1 1/2] x86/sev: Use per-CPU PSC structure in prep for
unaccepted memory support
On 8/3/22 16:18, Dave Hansen wrote:
> On 8/3/22 14:03, Tom Lendacky wrote:
>>> This whole iteration does look good to me versus the per-cpu version, so
>>> I say go ahead with doing this for v2 once you wait a bit for any more
>>> feedback.
>>
>> I'm still concerned about the whole spinlock and performance. What if I
>> reduce the number of entries in the PSC structure to, say, 64, which
>> reduces the size of the struct to 520 bytes? Any issue if that is put on
>> the stack, instead? It definitely makes things less complicated and
>> feels like a good compromise on the size vs the number of PSC VMGEXIT
>> requests.
>
> That would be fine too.
Ok.
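To make the size math concrete, roughly something like this (just a
sketch, assuming the existing 8-byte psc_hdr and 8-byte psc_entry
definitions stay as they are and only the entry count changes):

        #define VMGEXIT_PSC_MAX_ENTRY   64      /* down from 253 */

        struct snp_psc_desc {
                struct psc_hdr hdr;                               /*  8 bytes      */
                struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];  /* 64 * 8 bytes  */
        } __packed;                                               /* 520 bytes     */

That should be small enough to live on the caller's stack and drop the
per-CPU copy and spinlock entirely.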
>
> But, I doubt there will be any real performance issues coming out of
> this. As bad as this MSR thing is, I suspect it's not half as
> disastrous as the global spinlock in Kirill's patches.
>
> Also, private<->shared page conversions are *NOT* common from what I can
> tell. There are a few pages converted at boot, but most of the
> guest<->host communications are through the swiotlb pages, which are static.
Generally, that's true. But, e.g., dma_alloc_coherent() doesn't actually
go through SWIOTLB; instead it allocates the pages and makes them shared,
which results in a page state change. The NVMe driver was calling that API
a lot. In this case, though, the NVMe driver was calling it from IRQ
context, where set_memory_decrypted() can't be used because it can sleep,
so an unencrypted DMA memory pool was created to work around the sleeping
issue and reduce the number of page state changes. It's just things like
that that make me wary.
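For reference, a heavily simplified sketch of that path, roughly what
dma_direct_alloc() ends up doing for an SEV guest (names follow
kernel/dma/direct.c; dma_alloc_sketch() is just a stand-in name, and
header includes, attrs and error handling are omitted):

        /* Illustrative stand-in for the real dma_direct_alloc() */
        static void *dma_alloc_sketch(struct device *dev, size_t size,
                                      dma_addr_t *dma_handle, gfp_t gfp)
        {
                struct page *page = alloc_pages(gfp, get_order(size));
                void *vaddr = page_address(page);

                if (force_dma_unencrypted(dev)) {
                        /*
                         * Clears the C-bit and issues private->shared
                         * page state changes to the hypervisor -- one
                         * PSC sequence per allocation.  This can sleep,
                         * which is why it can't be done from the IRQ
                         * context mentioned above.
                         */
                        set_memory_decrypted((unsigned long)vaddr,
                                             1 << get_order(size));
                }

                *dma_handle = phys_to_dma(dev, page_to_phys(page));
                return vaddr;
        }

And the matching free converts the pages back to private, so a driver
doing this per I/O generates far more PSC traffic than the static
swiotlb case.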
Thanks,
Tom
>
> Are there other things that SEV uses this structure for that I'm missing?