lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <diqzjz7azkmf.fsf@ackerleytng-ctop.c.googlers.com>
Date: Wed, 23 Apr 2025 13:30:16 -0700
From: Ackerley Tng <ackerleytng@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>, Michael Roth <michael.roth@....com>, kvm@...r.kernel.org, 
	linux-coco@...ts.linux.dev, linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	jroedel@...e.de, thomas.lendacky@....com, pbonzini@...hat.com, 
	seanjc@...gle.com, vbabka@...e.cz, amit.shah@....com, 
	pratikrajesh.sampat@....com, ashish.kalra@....com, liam.merwick@...cle.com, 
	david@...hat.com, vannapurve@...gle.com, quic_eberman@...cinc.com
Subject: Re: [PATCH 3/5] KVM: gmem: Hold filemap invalidate lock while
 allocating/preparing folios

Yan Zhao <yan.y.zhao@...el.com> writes:

> On Fri, Mar 14, 2025 at 05:20:21PM +0800, Yan Zhao wrote:
>> This patch would cause host deadlock when booting up a TDX VM even if huge page
>> is turned off. I currently reverted this patch. No further debug yet.
> This is because kvm_gmem_populate() takes filemap invalidation lock, and for
> TDX, kvm_gmem_populate() further invokes kvm_gmem_get_pfn(), causing deadlock.
>
> kvm_gmem_populate
>   filemap_invalidate_lock
>   post_populate
>     tdx_gmem_post_populate
>       kvm_tdp_map_page
>        kvm_mmu_do_page_fault
>          kvm_tdp_page_fault
> 	   kvm_tdp_mmu_page_fault
> 	     kvm_mmu_faultin_pfn
> 	       __kvm_mmu_faultin_pfn
> 	         kvm_mmu_faultin_pfn_private
> 		   kvm_gmem_get_pfn
> 		     filemap_invalidate_lock_shared
> 	
> Though, kvm_gmem_populate() is able to take shared filemap invalidation lock,
> (then no deadlock), lockdep would still warn "Possible unsafe locking scenario:
> ...DEADLOCK" due to the recursive shared lock, since commit e918188611f0
> ("locking: More accurate annotations for read_lock()").
>

Thank you for investigating. This should be fixed in the next revision.

>> > @@ -819,12 +827,16 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>> >  	pgoff_t index = kvm_gmem_get_index(slot, gfn);
>> >  	struct file *file = kvm_gmem_get_file(slot);
>> >  	int max_order_local;
>> > +	struct address_space *mapping;
>> >  	struct folio *folio;
>> >  	int r = 0;
>> >  
>> >  	if (!file)
>> >  		return -EFAULT;
>> >  
>> > +	mapping = file->f_inode->i_mapping;
>> > +	filemap_invalidate_lock_shared(mapping);
>> > +
>> >  	/*
>> >  	 * The caller might pass a NULL 'max_order', but internally this
>> >  	 * function needs to be aware of any order limitations set by
>> > @@ -838,6 +850,7 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>> >  	folio = __kvm_gmem_get_pfn(file, slot, index, pfn, &max_order_local);
>> >  	if (IS_ERR(folio)) {
>> >  		r = PTR_ERR(folio);
>> > +		filemap_invalidate_unlock_shared(mapping);
>> >  		goto out;
>> >  	}
>> >  
>> > @@ -845,6 +858,7 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>> >  		r = kvm_gmem_prepare_folio(kvm, file, slot, gfn, folio, max_order_local);
>> >  
>> >  	folio_unlock(folio);
>> > +	filemap_invalidate_unlock_shared(mapping);
>> >  
>> >  	if (!r)
>> >  		*page = folio_file_page(folio, index);
>> > -- 
>> > 2.25.1
>> > 
>> > 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ