Message-ID: <345d89c1-4f31-6b49-2cd4-a0696210fa7c@loongson.cn>
Date: Sat, 3 Aug 2024 11:02:07 +0800
From: maobibo <maobibo@...ngson.cn>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Marc Zyngier <maz@...nel.org>,
Oliver Upton <oliver.upton@...ux.dev>, Tianrui Zhao
<zhaotianrui@...ngson.cn>, Huacai Chen <chenhuacai@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>, Anup Patel <anup@...infault.org>,
Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt
<palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>, kvm@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
loongarch@...ts.linux.dev, linux-mips@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, kvm-riscv@...ts.infradead.org,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
David Matlack <dmatlack@...gle.com>, David Stevens <stevensd@...omium.org>
Subject: Re: [PATCH v12 64/84] KVM: LoongArch: Mark "struct page" pfns dirty
only in "slow" page fault path
On 2024/8/3 3:32 AM, Sean Christopherson wrote:
> On Fri, Aug 02, 2024, maobibo wrote:
>> On 2024/7/27 7:52 AM, Sean Christopherson wrote:
>>> Mark pages/folios dirty only in the slow page fault path, i.e. only when
>>> mmu_lock is held and the operation is mmu_notifier-protected, as marking a
>>> page/folio dirty after it has been written back can make some filesystems
>>> unhappy (backing KVM guests with such filesystem files is uncommon, and
>>> the race is minuscule, hence the lack of complaints).
>>>
>>> See the link below for details.
>>>
>>> Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com
>>> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
>>> ---
>>> arch/loongarch/kvm/mmu.c | 18 ++++++++++--------
>>> 1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c
>>> index 2634a9e8d82c..364dd35e0557 100644
>>> --- a/arch/loongarch/kvm/mmu.c
>>> +++ b/arch/loongarch/kvm/mmu.c
>>> @@ -608,13 +608,13 @@ static int kvm_map_page_fast(struct kvm_vcpu *vcpu, unsigned long gpa, bool writ
>>> if (kvm_pte_young(changed))
>>> kvm_set_pfn_accessed(pfn);
>>> - if (kvm_pte_dirty(changed)) {
>>> - mark_page_dirty(kvm, gfn);
>>> - kvm_set_pfn_dirty(pfn);
>>> - }
>>> if (page)
>>> put_page(page);
>>> }
>>> +
>>> + if (kvm_pte_dirty(changed))
>>> + mark_page_dirty(kvm, gfn);
>>> +
>>> return ret;
>>> out:
>>> spin_unlock(&kvm->mmu_lock);
>>> @@ -915,12 +915,14 @@ static int kvm_map_page(struct kvm_vcpu *vcpu, unsigned long gpa, bool write)
>>> else
>>> ++kvm->stat.pages;
>>> kvm_set_pte(ptep, new_pte);
>>> - spin_unlock(&kvm->mmu_lock);
>>> - if (prot_bits & _PAGE_DIRTY) {
>>> - mark_page_dirty_in_slot(kvm, memslot, gfn);
>>> + if (writeable)
>> Would it be better to use write or (prot_bits & _PAGE_DIRTY) here?
>> writeable is the PTE permission returned by hva_to_pfn_slow(), whereas
>> write is the fault action.
>
> Marking folios dirty in the slow/full path basically necessitates marking the
> folio dirty if KVM creates a writable SPTE, as KVM won't mark the folio dirty
> if/when _PAGE_DIRTY is set.
>
> Practically speaking, I'm 99.9% certain it doesn't matter. The folio is marked
> dirty by core MM when the folio is made writable, and cleaning the folio triggers
> an mmu_notifier invalidation. I.e. if the page is mapped writable in KVM's
Yes, it is. Thanks for the explanation. kvm_set_pfn_dirty() can be called
only in the slow page fault path. My only concern is the fault type: a
read fault can make the PTE writable in the stage-2 MMU table without
setting _PAGE_DIRTY.
> stage-2 PTEs, then its folio has already been marked dirty.
Consider one scenario, although I do not know whether it actually occurs.
The user-mode VMM first writes the folio through its hva, and then a vCPU
thread *reads* the folio. In the primary MMU table, the PTE is writable
and _PAGE_DIRTY is set. In the secondary MMU table (the stage-2 PTE
table), the entry is pte_none because the folio is being accessed for the
first time, so the slow page fault path is taken to fill the stage-2 page
table. Since it is a read fault, the stage-2 PTE is created with
_PAGE_WRITE (coming from hva_to_pfn_slow()), but _PAGE_DIRTY is not set.
Do we need to call kvm_set_pfn_dirty() in this situation?
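
To make the scenario concrete, here is a minimal sketch in C. It is a
simplified model and not the code in arch/loongarch/kvm/mmu.c: the bit
values and every sk_-prefixed name are hypothetical, and the model
assumes that _PAGE_DIRTY is set only for a write fault on a writeable
mapping. It compares the two candidate conditions for marking the folio
dirty in the slow path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration; not the real LoongArch encodings. */
#define SK_PAGE_WRITE (1u << 0)
#define SK_PAGE_DIRTY (1u << 1)

/*
 * Stage-2 protection bits installed by the slow path in this model.
 * write_fault: the guest access that faulted was a write.
 * writeable:   the host mapping allows writes (as reported by the pfn lookup).
 */
static inline unsigned int sk_stage2_prot(bool write_fault, bool writeable)
{
	unsigned int prot = 0;

	if (writeable) {
		prot |= SK_PAGE_WRITE;
		if (write_fault)
			prot |= SK_PAGE_DIRTY;
	}

	/* In this model a dirty PTE is always a writable PTE. */
	assert(!(prot & SK_PAGE_DIRTY) || (prot & SK_PAGE_WRITE));
	return prot;
}

/* Candidate condition 1: mark the folio dirty only if _PAGE_DIRTY is set. */
static inline bool sk_mark_dirty_by_prot(unsigned int prot)
{
	return (prot & SK_PAGE_DIRTY) != 0;
}

/* Candidate condition 2: mark the folio dirty whenever the PTE is writable. */
static inline bool sk_mark_dirty_by_writeable(bool writeable)
{
	return writeable;
}
```

For the read fault described above (write_fault false, writeable true),
the model yields a PTE with the write bit but without the dirty bit, so
condition 1 skips marking the folio dirty while condition 2 marks it.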
Regards
Bibo Mao