Message-ID: <ebcf2979-45fc-8d41-cc28-ac8da0d24245@intel.com>
Date: Thu, 21 Jul 2022 08:49:31 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Borislav Petkov <bp@...en8.de>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Andy Lutomirski <luto@...nel.org>,
Sean Christopherson <seanjc@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Joerg Roedel <jroedel@...e.de>,
Ard Biesheuvel <ardb@...nel.org>,
Andi Kleen <ak@...ux.intel.com>,
Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Tom Lendacky <thomas.lendacky@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
Varad Gautam <varad.gautam@...e.com>,
Dario Faggioli <dfaggioli@...e.com>,
Mike Rapoport <rppt@...nel.org>,
David Hildenbrand <david@...hat.com>,
marcelo.cerri@...onical.com, tim.gardner@...onical.com,
khalid.elmously@...onical.com, philip.cox@...onical.com,
x86@...nel.org, linux-mm@...ck.org, linux-coco@...ts.linux.dev,
linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org,
Mike Rapoport <rppt@...ux.ibm.com>
Subject: Re: [PATCHv7 02/14] mm: Add support for unaccepted memory

On 7/21/22 08:14, Borislav Petkov wrote:
> On Tue, Jun 14, 2022 at 03:02:19PM +0300, Kirill A. Shutemov wrote:
>> On-demand memory accept means latency spikes every time kernel steps
>> onto a new memory block. The spikes will go away once workload data
>> set size gets stabilized or all memory gets accepted.
> What does that mean?
>
> If we're accepting 2M pages and considering referential locality, how
> are those "spikes" even noticeable?

Acceptance is slow and the heavy lifting is done inside the TDX module.
It involves flushing old aliases out of the caches and initializing the
memory integrity metadata for every cacheline. This implementation does
acceptance in 2MB chunks while holding a global lock.

So, those (effective) 2MB clflush+memset's (plus a few thousand cycles
for the hypercall/tdcall transitions) can't happen in parallel. They
are serialized and must wait on each other. If you have a few hundred
CPUs all trying to allocate memory (say, doing the first kernel compile
after a reboot), this is going to be very, very painful for a while.
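
To put a rough shape on that serialization, here is a back-of-envelope
model (a sketch, not a measurement). It assumes every CPU hits a fresh
2MB chunk at the same instant and queues on the one global lock. The
per-chunk cost ACCEPT_US is a made-up placeholder, and the function
names are hypothetical; plug in whatever you actually measure.

```python
# Toy model of acceptance serialized behind a single global lock.
# ACCEPT_US is a placeholder value, NOT a measured TDX number.

CHUNK_BYTES = 2 * 1024 * 1024  # acceptance unit described above: 2MB


def serialized_wait_us(ncpus, accept_us):
    """Per-CPU latency when all CPUs arrive at once and take turns."""
    # CPU k (0-based) waits for k earlier lock holders, then accepts
    # its own chunk, so it sees (k + 1) * accept_us in total.
    return [(k + 1) * accept_us for k in range(ncpus)]


def parallel_wait_us(ncpus, accept_us):
    """Hypothetical comparison: no shared lock, all accepts overlap."""
    return [accept_us] * ncpus


ACCEPT_US = 100   # placeholder per-chunk acceptance cost, microseconds
NCPUS = 256       # "a few hundred CPUs"

waits = serialized_wait_us(NCPUS, ACCEPT_US)
worst = max(waits)               # last CPU in the queue
mean = sum(waits) / len(waits)   # average over all CPUs
print(f"serialized: worst={worst}us mean={mean}us")
print(f"parallel:   worst={max(parallel_wait_us(NCPUS, ACCEPT_US))}us")
```

The point is only that the worst-case wait grows linearly with the
number of CPUs contending for the lock, whatever the real per-chunk
cost turns out to be.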

That said, I think this is the right place to _start_. There will need
to be a follow-on solution (likely some kind of background acceptance).
But, even with that solution, *this* code is still needed to handle the
degenerate case where the background accepter can't keep up with
foreground memory needs.
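
A minimal sketch of that split, under loud assumptions: this is a toy
userspace model, not kernel code and not what the patch set does. The
class and method names are hypothetical, and acceptance itself is
reduced to removing a chunk index from a set. It only illustrates that
the foreground path must still be able to accept on demand when the
background accepter has not reached a chunk yet.

```python
import threading


class UnacceptedMemory:
    """Toy model: tracks which 2MB chunks still need acceptance."""

    def __init__(self, nchunks):
        self.lock = threading.Lock()          # stands in for the global lock
        self.unaccepted = set(range(nchunks))
        self.foreground_accepts = 0
        self.background_accepts = 0

    def _accept_locked(self, chunk):
        # Placeholder for the real (slow) acceptance operation.
        self.unaccepted.discard(chunk)

    def background_step(self):
        """Accept one pending chunk; return False once nothing is left."""
        with self.lock:
            if not self.unaccepted:
                return False
            self._accept_locked(min(self.unaccepted))
            self.background_accepts += 1
            return True

    def alloc(self, chunk):
        """Foreground path: accept on demand if still unaccepted."""
        with self.lock:
            if chunk in self.unaccepted:
                self._accept_locked(chunk)
                self.foreground_accepts += 1
        return chunk


mem = UnacceptedMemory(8)
mem.background_step()        # background gets to chunk 0 first
mem.alloc(0)                 # already accepted: no foreground cost
mem.alloc(5)                 # background has not caught up: accept now
while mem.background_step():
    pass                     # background drains the remaining chunks
```

In this model the foreground fallback fires exactly when an allocation
outruns the background accepter, which is the degenerate case above.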