Message-ID: <ZsUOR2Sf2A07U6ox@google.com>
Date: Tue, 20 Aug 2024 14:44:39 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Mushahid Hussain <hmushi@...zon.co.uk>,
Vitaly Kuznetsov <vkuznets@...hat.com>, Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>, Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Mingwei Zhang <mizhang@...gle.com>,
Maxim Levitsky <mlevitsk@...hat.com>
Subject: Re: [PATCH v2] KVM: Move gfn_to_pfn_cache invalidation to
invalidate_range_end hook
On Tue, Aug 20, 2024, David Woodhouse wrote:
> From: David Woodhouse <dwmw@...zon.co.uk>
>
> The existing retry loop in hva_to_pfn_retry() is extremely pessimistic.
> If there are any concurrent invalidations running, it's effectively just
> a complex busy wait loop because its local mmu_notifier_retry_cache()
> function will always return true.
>
> Since multiple invalidations can be running in parallel, this can result
> in a situation where hva_to_pfn_retry() just backs off and keeps retrying
> forever, never making any progress.
>
> Solve this by being a bit more selective about when to retry.
>
> Introduce a separate 'needs invalidation' flag to the GPC, which allows
> it to be marked invalid even while hva_to_pfn_retry() has dropped the
> lock to map the newly-found PFN. This allows the invalidation to be moved
> to the range_end hook, and the retry loop only occurs for a given GPC if
> its particular uHVA is affected.
>
> However, the contract for invalidate_range_{start,end} is not like a
> simple TLB; the pages may have been freed by the time the end hook is
> called. A "device" may not create new mappings after the _start_ hook is
> called. To meet this requirement, hva_to_pfn_retry() now waits until no
> invalidations are currently running which may affect its uHVA, before
> finally setting the ->valid flag and returning.
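The scheme described above can be sketched as a simplified userspace model. All names here (struct gpc, range_start, range_end, try_refresh, nr_invalidations) are illustrative stand-ins, not the real KVM API, and locking is omitted entirely:

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a gfn_to_pfn_cache (GPC). */
struct gpc {
	unsigned long uhva;	 /* host virtual address backing the cache */
	bool valid;		 /* cached mapping may be used */
	bool needs_invalidation; /* an overlapping invalidation is in flight */
};

/* Invalidations currently between their start and end hooks. */
static int nr_invalidations;

/* invalidate_range_start: from here on, no new mappings may be created. */
static void range_start(struct gpc *c, unsigned long start, unsigned long end)
{
	nr_invalidations++;
	if (c->uhva >= start && c->uhva < end) {
		c->valid = false;
		c->needs_invalidation = true;
	}
}

/* invalidate_range_end: the affected pages may already be freed by now. */
static void range_end(struct gpc *c)
{
	nr_invalidations--;
	if (!nr_invalidations)
		c->needs_invalidation = false;
}

/*
 * hva_to_pfn_retry sketch: retry only if *this* cache's uHVA was hit by
 * an invalidation, and wait for all in-flight invalidations to drain
 * before finally setting ->valid.
 */
static bool try_refresh(struct gpc *c)
{
	if (c->needs_invalidation || nr_invalidations)
		return false;	/* caller retries; real code would sleep/wait */
	c->valid = true;
	return true;
}
```

A refresh attempted between range_start() and range_end() for an overlapping range fails and is retried; once the last invalidation ends, the next attempt marks the cache valid.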
Please split this into 3 patches:
1. Add range-based GPC retry
2. Add the wait mechanism.
3. Add the needs_invalidation logic.
#1 and #2 make sense to me, but I'm struggling to understand why #3 is needed.
KVM absolutely must not touch the memory after .invalidate_range_start(), so I
don't see what is gained by deferring invalidation to invalidate_range_end().