lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Oct 2023 12:45:20 -0500
From:   Tom Lendacky <thomas.lendacky@....com>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Michael Roth <michael.roth@....com>
Cc:     Borislav Petkov <bp@...en8.de>, Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Joerg Roedel <jroedel@...e.de>,
        Ard Biesheuvel <ardb@...nel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Ingo Molnar <mingo@...hat.com>,
        Dario Faggioli <dfaggioli@...e.com>,
        Mike Rapoport <rppt@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        marcelo.cerri@...onical.com, tim.gardner@...onical.com,
        khalid.elmously@...onical.com, philip.cox@...onical.com,
        aarcange@...hat.com, peterx@...hat.com, x86@...nel.org,
        linux-mm@...ck.org, linux-coco@...ts.linux.dev,
        linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv14 5/9] efi: Add unaccepted memory support

On 10/13/23 11:22, Kirill A. Shutemov wrote:
> On Fri, Oct 13, 2023 at 03:33:58PM +0300, Kirill A. Shutemov wrote:
>>> While testing SNP guests running today's tip/master (ef19bc9dddc3) I ran
>>> into what seems to be fairly significant lock contention due to the
>>> unaccepted_memory_lock spinlock above, which results in a constant stream
>>> of soft-lockups until the workload gets all its memory accepted/faulted
>>> in if the guest has around 16+ vCPUs.
>>>
>>> I've included the guest dmesg traces I was seeing below.
>>>
>>> In this case I was running a 32 vCPU guest with 200GB of memory running on
>>> a 256 thread EPYC (Milan) system, and can trigger the above situation fairly
>>> reliably by running the following workload in a freshly-booted guests:
>>>
>>>    stress --vm 32 --vm-bytes 5G --vm-keep
>>>
>>> Scaling up the number of stress threads and vCPUs should make it easier
>>> to reproduce.
>>>
>>> Other than unresponsiveness/lockup messages until the memory is accepted,
>>> the guest seems to continue running fine, but for large guests where
>>> unaccepted memory is more likely to be useful, it seems like it could be
>>> an issue, especially when consider 100+ vCPU guests.
>>
>> Okay, sorry for delay. It took time to reproduce it with TDX.
>>
>> I will look what can be done.
> 
> Could you check if the patch below helps?
> 
> diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
> index 853f7dc3c21d..591da3f368fa 100644
> --- a/drivers/firmware/efi/unaccepted_memory.c
> +++ b/drivers/firmware/efi/unaccepted_memory.c
> @@ -8,6 +8,14 @@
>   /* Protects unaccepted memory bitmap */
>   static DEFINE_SPINLOCK(unaccepted_memory_lock);
>   
> +struct accept_range {
> +	struct list_head list;
> +	unsigned long start;
> +	unsigned long end;
> +};
> +
> +static LIST_HEAD(accepting_list);
> +
>   /*
>    * accept_memory() -- Consult bitmap and accept the memory if needed.
>    *
> @@ -24,6 +32,7 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
>   {
>   	struct efi_unaccepted_memory *unaccepted;
>   	unsigned long range_start, range_end;
> +	struct accept_range range, *entry;
>   	unsigned long flags;
>   	u64 unit_size;
>   
> @@ -80,7 +89,25 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
>   
>   	range_start = start / unit_size;
>   
> +	range.start = start;
> +	range.end = end;
> +retry:
>   	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> +
> +	list_for_each_entry(entry, &accepting_list, list) {
> +		if (entry->end < start)
> +			continue;
> +		if (entry->start > end)

Should this be a >= check since start and end are page aligned values?

> +			continue;
> +		spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> +
> +		/* Somebody else accepting the range */
> +		cpu_relax();
> +		goto retry;

Could you set some kind of flag here so that ...

> +	}
> +

... once you get here, that means that area was accepted and removed from 
the list, so I think you could just drop the lock and exit now, right? 
Because at that point the bitmap will have been updated and you wouldn't 
be accepting any memory anyway?

Thanks,
Tom

> +	list_add(&range.list, &accepting_list);
> +
>   	for_each_set_bitrange_from(range_start, range_end, unaccepted->bitmap,
>   				   DIV_ROUND_UP(end, unit_size)) {
>   		unsigned long phys_start, phys_end;
> @@ -89,9 +116,15 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
>   		phys_start = range_start * unit_size + unaccepted->phys_base;
>   		phys_end = range_end * unit_size + unaccepted->phys_base;
>   
> +		spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> +
>   		arch_accept_memory(phys_start, phys_end);
> +
> +		spin_lock_irqsave(&unaccepted_memory_lock, flags);
>   		bitmap_clear(unaccepted->bitmap, range_start, len);
>   	}
> +
> +	list_del(&range.list);
>   	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
>   }
>   

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ