Message-ID: <b33876b6-96bd-afe0-3523-b51b8628616a@amd.com>
Date: Wed, 3 Aug 2022 16:34:53 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
x86@...nel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
"H. Peter Anvin" <hpa@...or.com>,
Michael Roth <michael.roth@....com>,
Joerg Roedel <jroedel@...e.de>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v1.1 1/2] x86/sev: Use per-CPU PSC structure in prep for
unaccepted memory support
On 8/3/22 16:18, Dave Hansen wrote:
> On 8/3/22 14:03, Tom Lendacky wrote:
>>> This whole iteration does look good to me versus the per-cpu version, so
>>> I say go ahead with doing this for v2 once you wait a bit for any more
>>> feedback.
>>
>> I'm still concerned about the whole spinlock and performance. What if I
>> reduce the number of entries in the PSC structure to, say, 64, which
>> reduces the size of the struct to 520 bytes? Any issue if that is put on
>> the stack, instead? It definitely makes things less complicated and
>> feels like a good compromise on the size vs the number of PSC VMGEXIT
>> requests.
>
> That would be fine too.
Ok.
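To make the size math concrete, roughly something like this (just a
sketch, assuming the existing 8-byte psc_hdr and 8-byte psc_entry
definitions stay as they are and only the entry count changes):

        #define VMGEXIT_PSC_MAX_ENTRY   64      /* down from 253 */

        struct snp_psc_desc {
                struct psc_hdr hdr;                               /*  8 bytes      */
                struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];  /* 64 * 8 bytes  */
        } __packed;                                               /* 520 bytes     */

That should be small enough to live on the caller's stack and drop the
per-CPU copy and spinlock entirely.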
>
> But, I doubt there will be any real performance issues coming out of
> this. As bad as this MSR thing is, I suspect it's not half as
> disastrous as the global spinlock in Kirill's patches.
>
> Also, private<->shared page conversions are *NOT* common from what I can
> tell. There are a few pages converted at boot, but most of the
> guest<->host communications are through the swiotlb pages, which are static.
Generally, that's true. But, e.g., dma_alloc_coherent() doesn't actually
go through SWIOTLB; instead it allocates the pages and makes them shared,
which results in a page state change. The NVMe driver was calling that API
a lot. In this case, though, the NVMe driver was calling it from IRQ
context, where set_memory_decrypted() can't be used because it can sleep,
so an unencrypted DMA memory pool was created to work around the sleeping
issue and reduce the number of page state changes. It's just things like
that that make me wary.
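For reference, a heavily simplified sketch of that path, roughly what
dma_direct_alloc() ends up doing for an SEV guest (names follow
kernel/dma/direct.c; dma_alloc_sketch() is just a stand-in name, and
header includes, attrs and error handling are omitted):

        /* Illustrative stand-in for the real dma_direct_alloc() */
        static void *dma_alloc_sketch(struct device *dev, size_t size,
                                      dma_addr_t *dma_handle, gfp_t gfp)
        {
                struct page *page = alloc_pages(gfp, get_order(size));
                void *vaddr = page_address(page);

                if (force_dma_unencrypted(dev)) {
                        /*
                         * Clears the C-bit and issues private->shared
                         * page state changes to the hypervisor -- one
                         * PSC sequence per allocation.  This can sleep,
                         * which is why it can't be done from the IRQ
                         * context mentioned above.
                         */
                        set_memory_decrypted((unsigned long)vaddr,
                                             1 << get_order(size));
                }

                *dma_handle = phys_to_dma(dev, page_to_phys(page));
                return vaddr;
        }

And the matching free converts the pages back to private, so a driver
doing this per I/O generates far more PSC traffic than the static
swiotlb case.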
Thanks,
Tom
>
> Are there other things that SEV uses this structure for that I'm missing?