Message-ID: <570b2f04-0c46-4a40-9b59-b9db1b5b6185@redhat.com>
Date: Thu, 13 Feb 2025 21:33:58 +0100
From: David Hildenbrand <david@...hat.com>
To: Claudio Imbrenda <imbrenda@...ux.ibm.com>, kvm@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org,
 frankja@...ux.ibm.com, borntraeger@...ibm.com, nrb@...ux.ibm.com,
 seiden@...ux.ibm.com, nsg@...ux.ibm.com, schlameuss@...ux.ibm.com,
 hca@...ux.ibm.com
Subject: Re: [PATCH v1 2/2] KVM: s390: pv: fix race when making a page secure

On 13.02.25 21:16, David Hildenbrand wrote:
> On 13.02.25 21:07, Claudio Imbrenda wrote:
>> Holding the pte lock for the page that is being converted to secure is
>> needed to avoid races. A previous commit removed the locking, which
>> caused issues. Fix by locking the pte again.
>>
>> Fixes: 5cbe24350b7d ("KVM: s390: move pv gmap functions into kvm")
> 
> If you found this because of my report about the changed locking,
> consider adding a Suggested-by / Reported-by.
> 
>> Signed-off-by: Claudio Imbrenda <imbrenda@...ux.ibm.com>
>> ---
>>    arch/s390/include/asm/uv.h |  2 +-
>>    arch/s390/kernel/uv.c      | 19 +++++++++++++++++--
>>    arch/s390/kvm/gmap.c       | 12 ++++++++----
>>    3 files changed, 26 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/uv.h b/arch/s390/include/asm/uv.h
>> index b11f5b6d0bd1..46fb0ef6f984 100644
>> --- a/arch/s390/include/asm/uv.h
>> +++ b/arch/s390/include/asm/uv.h
>> @@ -631,7 +631,7 @@ int uv_pin_shared(unsigned long paddr);
>>    int uv_destroy_folio(struct folio *folio);
>>    int uv_destroy_pte(pte_t pte);
>>    int uv_convert_from_secure_pte(pte_t pte);
>> -int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb);
>> +int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header *uvcb);
>>    int uv_convert_from_secure(unsigned long paddr);
>>    int uv_convert_from_secure_folio(struct folio *folio);
>>    
>> diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
>> index 9f05df2da2f7..de3c092da7b9 100644
>> --- a/arch/s390/kernel/uv.c
>> +++ b/arch/s390/kernel/uv.c
>> @@ -245,7 +245,7 @@ static int expected_folio_refs(struct folio *folio)
>>     * Context: The caller must hold exactly one extra reference on the folio
>>     *          (it's the same logic as split_folio())
>>     */
>> -int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)
>> +static int __make_folio_secure(struct folio *folio, unsigned long hva, struct uv_cb_header *uvcb)
>>    {
>>    	int expected, cc = 0;
>>    
>> @@ -277,7 +277,22 @@ int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)
>>    		return -EAGAIN;
>>    	return uvcb->rc == 0x10a ? -ENXIO : -EINVAL;
>>    }
>> -EXPORT_SYMBOL_GPL(make_folio_secure);
>> +
>> +int make_hva_secure(struct mm_struct *mm, unsigned long hva, struct uv_cb_header *uvcb)
>> +{
>> +	spinlock_t *ptelock;
>> +	pte_t *ptep;
>> +	int rc;
>> +
>> +	ptep = get_locked_pte(mm, hva, &ptelock);
>> +	if (!ptep)
>> +		return -ENXIO;
>> +	rc = __make_folio_secure(page_folio(pte_page(*ptep)), hva, uvcb);
>> +	pte_unmap_unlock(ptep, ptelock);
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_GPL(make_hva_secure);
>>    
>>    /*
>>     * To be called with the folio locked or with an extra reference! This will
>> diff --git a/arch/s390/kvm/gmap.c b/arch/s390/kvm/gmap.c
>> index fc4d490d25a2..e56c0ab5fec7 100644
>> --- a/arch/s390/kvm/gmap.c
>> +++ b/arch/s390/kvm/gmap.c
>> @@ -55,7 +55,7 @@ static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_str
>>    	return atomic_read(&mm->context.protected_count) > 1;
>>    }
>>    
>> -static int __gmap_make_secure(struct gmap *gmap, struct page *page, void *uvcb)
>> +static int __gmap_make_secure(struct gmap *gmap, struct page *page, unsigned long hva, void *uvcb)
>>    {
>>    	struct folio *folio = page_folio(page);
>>    	int rc;
>> @@ -83,7 +83,7 @@ static int __gmap_make_secure(struct gmap *gmap, struct page *page, void *uvcb)
>>    		return -EAGAIN;
>>    	if (should_export_before_import(uvcb, gmap->mm))
>>    		uv_convert_from_secure(folio_to_phys(folio));
>> -	rc = make_folio_secure(folio, uvcb);
>> +	rc = make_hva_secure(gmap->mm, hva, uvcb);
>>    	folio_unlock(folio);
>>    
>>    	/*
>> @@ -120,6 +120,7 @@ static int __gmap_make_secure(struct gmap *gmap, struct page *page, void *uvcb)
>>    int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
>>    {
>>    	struct kvm *kvm = gmap->private;
>> +	unsigned long vmaddr;
>>    	struct page *page;
>>    	int rc = 0;
>>    
>> @@ -127,8 +128,11 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
>>    
>>    	page = gfn_to_page(kvm, gpa_to_gfn(gaddr));
>>    	mmap_read_lock(gmap->mm);
>> -	if (page)
>> -		rc = __gmap_make_secure(gmap, page, uvcb);
>> +	vmaddr = gfn_to_hva(gmap->private, gpa_to_gfn(gaddr));
>> +	if (kvm_is_error_hva(vmaddr))
>> +		rc = -ENXIO;
>> +	if (!rc && page)
>> +		rc = __gmap_make_secure(gmap, page, vmaddr, uvcb);
>>    	kvm_release_page_clean(page);
>>    	mmap_read_unlock(gmap->mm);
>>    
> 
> You effectively make the code more complicated and inefficient than
> before: now you walk the page table twice in the common small-folio
> case ...
> 
> Can we just go back to the old handling that we had before here?
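
To spell out the two walks: gfn_to_page() already resolves the gfn and
faults the page in (a first page table walk via GUP), and
make_hva_secure() then performs a second walk via get_locked_pte() just
to take the PTE lock. Roughly (both lines are from the patch above):

	page = gfn_to_page(kvm, gpa_to_gfn(gaddr));	/* walk #1 (GUP) */
	...
	ptep = get_locked_pte(mm, hva, &ptelock);	/* walk #2 */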

I'll note that there is still the possibility of a different race:
nothing guarantees that the page you looked up will still be mapped at
the hva returned by gfn_to_hva() by the time you perform the
get_locked_pte(). I'm not sure what would happen if a different page
were mapped there by then.

You could re-verify that it is still there, but then you'd be doing two
page table walks, which is still more than required in the common case
where we can just perform the conversion.
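
For illustration, one way the re-check could look (just a sketch, not a
tested patch; the extra "expected" folio parameter is made up here):
after taking the PTE lock, verify that the PTE still maps the folio we
resolved earlier, and make the caller retry otherwise.

int make_hva_secure(struct mm_struct *mm, unsigned long hva,
		    struct folio *expected, struct uv_cb_header *uvcb)
{
	spinlock_t *ptelock;
	pte_t *ptep;
	int rc;

	ptep = get_locked_pte(mm, hva, &ptelock);
	if (!ptep)
		return -ENXIO;
	/* The mapping may have changed between the gfn_to_page() /
	 * gfn_to_hva() lookups and taking the PTE lock: re-check. */
	if (!pte_present(*ptep) ||
	    page_folio(pte_page(*ptep)) != expected) {
		pte_unmap_unlock(ptep, ptelock);
		return -EAGAIN;
	}
	rc = __make_folio_secure(expected, hva, uvcb);
	pte_unmap_unlock(ptep, ptelock);
	return rc;
}

But as said, that's still a second page table walk in the common case.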

-- 
Cheers,

David / dhildenb

