Message-ID: <Zl-cjHVKaQ0iQE5d@linux.dev>
Date: Tue, 4 Jun 2024 23:00:28 +0000
From: Oliver Upton <oliver.upton@...ux.dev>
To: James Houghton <jthoughton@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Paolo Bonzini <pbonzini@...hat.com>,
	Albert Ou <aou@...s.berkeley.edu>,
	Ankit Agrawal <ankita@...dia.com>, Anup Patel <anup@...infault.org>,
	Atish Patra <atishp@...shpatra.org>,
	Axel Rasmussen <axelrasmussen@...gle.com>,
	Bibo Mao <maobibo@...ngson.cn>,
	Catalin Marinas <catalin.marinas@....com>,
	David Matlack <dmatlack@...gle.com>,
	David Rientjes <rientjes@...gle.com>,
	Huacai Chen <chenhuacai@...nel.org>,
	James Morse <james.morse@....com>, Jonathan Corbet <corbet@....net>,
	Marc Zyngier <maz@...nel.org>,
	Michael Ellerman <mpe@...erman.id.au>,
	Nicholas Piggin <npiggin@...il.com>,
	Palmer Dabbelt <palmer@...belt.com>,
	Paul Walmsley <paul.walmsley@...ive.com>,
	Raghavendra Rao Ananta <rananta@...gle.com>,
	Ryan Roberts <ryan.roberts@....com>,
	Sean Christopherson <seanjc@...gle.com>,
	Shaoqin Huang <shahuang@...hat.com>, Shuah Khan <shuah@...nel.org>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Tianrui Zhao <zhaotianrui@...ngson.cn>,
	Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
	Zenghui Yu <yuzenghui@...wei.com>, kvm-riscv@...ts.infradead.org,
	kvm@...r.kernel.org, kvmarm@...ts.linux.dev,
	linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
	linux-mips@...r.kernel.org, linux-mm@...ck.org,
	linux-riscv@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org,
	loongarch@...ts.linux.dev
Subject: Re: [PATCH v4 6/7] KVM: arm64: Relax locking for kvm_test_age_gfn
 and kvm_age_gfn

On Tue, Jun 04, 2024 at 03:20:20PM -0700, James Houghton wrote:
> On Fri, May 31, 2024 at 12:18 PM Oliver Upton <oliver.upton@...ux.dev> wrote:
> >
> > On Fri, May 31, 2024 at 12:11:33PM -0700, Oliver Upton wrote:
> > > On Wed, May 29, 2024 at 06:05:09PM +0000, James Houghton wrote:
> > >
> > > [...]
> > >
> > > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > > index 9e2bbee77491..eabb07c66a07 100644
> > > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > > @@ -1319,10 +1319,8 @@ static int stage2_age_walker(const struct kvm_pgtable_visit_ctx *ctx,
> > > >     data->young = true;
> > > >
> > > >     /*
> > > > -    * stage2_age_walker() is always called while holding the MMU lock for
> > > > -    * write, so this will always succeed. Nonetheless, this deliberately
> > > > -    * follows the race detection pattern of the other stage-2 walkers in
> > > > -    * case the locking mechanics of the MMU notifiers is ever changed.
> > > > +    * This walk may not be exclusive; the PTE is permitted to change
> > > > +    * from under us.
> > > >      */
> > > >     if (data->mkold && !stage2_try_set_pte(ctx, new))
> > > >             return -EAGAIN;
> > >
> > > It is probably worth mentioning that if there was a race to update the
> > > PTE then the GFN is most likely young, so failing to clear AF probably
> > > isn't even consequential.
> 
> Thanks Oliver.
> 
> >
> > Oh, and the WARN_ON() in kvm_pgtable_stage2_test_clear_young() is bogus
> > now. Maybe demote it to:
> >
> >   r = kvm_pgtable_walk(...);
> >   WARN_ON_ONCE(r && r != -EAGAIN);
> 
> Oh, indeed, thank you. Just to make sure -- does it make sense to
> retry the cmpxchg if it fails? For example, the way I have it now for
> x86[1], we retry the cmpxchg if the spte is still a leaf, otherwise we
> move on to the next one having done nothing. Does something like that
> make sense for arm64?

At least for arm64 I do not see a need for retry. The only possible
races are:

 - A stage-2 fault handler establishing / adjusting the mapping for the
   GFN. If the guest is directly accessing the GFN in question, what's
   the point of wiping out AF?

   Even when returning -EAGAIN we've already primed stage2_age_data::young,
   so we report the correct state back to the primary MMU.

 - Another kvm_age_gfn() trying to age the same GFN. I haven't even
   looked to see if this is possible from the primary MMU POV, but in
   theory one of the calls will win the race and clear AF.

Given Yu's concerns about making pending writers wait, we should take
every opportunity to bail on the walk.
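
To make the "no retry" argument concrete, here is a minimal userspace sketch
of the one-shot pattern. It is not the kernel code: age_one_pte(), struct
age_data and PTE_AF are hypothetical stand-ins for stage2_age_walker(),
stage2_age_data and the stage-2 access flag, and a C11 compare-exchange stands
in for stage2_try_set_pte(). The caller passes the PTE value the walker
observed, roughly like ctx->old.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative access-flag bit; an assumption for this sketch. */
#define PTE_AF (UINT64_C(1) << 10)

/* Hypothetical stand-in for stage2_age_data. */
struct age_data {
	bool mkold;	/* clear AF if set (aging), otherwise only test it */
	bool young;	/* result reported back to the primary MMU */
};

/*
 * One-shot aging of a single PTE. 'old' is the value the walker read
 * earlier. There is deliberately no retry loop: if the PTE changed
 * underneath us, bail with -EAGAIN.
 */
static int age_one_pte(_Atomic uint64_t *ptep, uint64_t old,
		       struct age_data *data)
{
	uint64_t new = old & ~PTE_AF;

	if (old == new)
		return 0;	/* AF was already clear, nothing to report */

	/* Prime the result before attempting the update. */
	data->young = true;

	/* Losing the race leaves the PTE alone; 'young' is still set. */
	if (data->mkold && !atomic_compare_exchange_strong(ptep, &old, new))
		return -EAGAIN;

	return 0;
}
```

Because young is set before the compare-exchange, a caller that treats
-EAGAIN as benign still sees the GFN reported as young, whichever racer won.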

-- 
Thanks,
Oliver
