linux-kernel - Re: [PATCH v1 1/5] KVM: arm64: Enable ring-based dirty memory tracking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YwTc++Lz6lh3aR4F@xz-m1.local>
Date:   Tue, 23 Aug 2022 09:58:19 -0400
From:   Peter Xu <peterx@...hat.com>
To:     Gavin Shan <gshan@...hat.com>
Cc:     Marc Zyngier <maz@...nel.org>, kvmarm@...ts.cs.columbia.edu,
        linux-arm-kernel@...ts.infradead.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kselftest@...r.kernel.org, pbonzini@...hat.com,
        corbet@....net, james.morse@....com, alexandru.elisei@....com,
        suzuki.poulose@....com, oliver.upton@...ux.dev,
        catalin.marinas@....com, will@...nel.org, shuah@...nel.org,
        seanjc@...gle.com, drjones@...hat.com, dmatlack@...gle.com,
        bgardon@...gle.com, ricarkol@...gle.com, zhenyzha@...hat.com,
        shan.gavin@...il.com
Subject: Re: [PATCH v1 1/5] KVM: arm64: Enable ring-based dirty memory
 tracking

On Tue, Aug 23, 2022 at 03:22:17PM +1000, Gavin Shan wrote:
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 986cee6fbc7f..0b41feb6fb7d 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -747,6 +747,12 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> >   		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
> >   			return kvm_vcpu_suspend(vcpu);
> > +
> > +		if (kvm_check_request(KVM_REQ_RING_SOFT_FULL, vcpu)) {
> > +			vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL;
> > +			trace_kvm_dirty_ring_exit(vcpu);
> > +			return 0;
> > +		}
> >   	}
> >   	return 1;
> > diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
> > index f4c2a6eb1666..08b2f01164fa 100644
> > --- a/virt/kvm/dirty_ring.c
> > +++ b/virt/kvm/dirty_ring.c
> > @@ -149,6 +149,7 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
> >   void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset)
> >   {
> > +	struct kvm_vcpu *vcpu = container_of(ring, struct kvm_vcpu, dirty_ring);
> >   	struct kvm_dirty_gfn *entry;
> >   	/* It should never get full */
> > @@ -166,6 +167,9 @@ void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset)
> >   	kvm_dirty_gfn_set_dirtied(entry);
> >   	ring->dirty_index++;
> >   	trace_kvm_dirty_ring_push(ring, slot, offset);
> > +
> > +	if (kvm_dirty_ring_soft_full(vcpu))
> > +		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
> >   }
> >   struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset)
> > 
> 
> Ok, thanks for the details, Marc. I will adopt your code in next revision :)

Note that there can be a slight difference with the old/new code, in that
an (especially malicious) userapp can logically ignore the DIRTY_RING_FULL
vmexit and keep kicking VCPU_RUN with the new code.

Unlike the old code, the 2nd/3rd/... KVM_RUN will still run in the new code
until the next dirty pfn being pushed to the ring, then it'll request ring
full exit again.

Each time it exits the ring grows 1.

At last iiuc it can easily hit the ring full and trigger the warning at the
entry of kvm_dirty_ring_push():

	/* It should never get full */
	WARN_ON_ONCE(kvm_dirty_ring_full(ring));

We did that because kvm_dirty_ring_push() was previously designed to not be
able to fail at all (e.g., in the old bitmap world we never will fail too).
We can't because we can't lose any dirty page or migration could silently
fail too (consider when we do user exit due to ring full and migration just
completed; there could be unsynced pages on src/dst).

So even though the old approach will need to read kvm->dirty_ring_size for
every entrance which is a pity, it will avoid issue above.

Side note: for x86 the dirty ring check was put at the entrance not because
it needs to be the highest priority - it should really be the same when
check kvm requests. It's just that it'll be the fastest way to fail
properly if needed before loading mmu, disabling preemption/irq, etc.

Thanks,

-- 
Peter Xu