[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADrL8HW+4Yq-wBr1+DzJvSwRRL_hqt5RaCCLgOQndPGUqoX+Rg@mail.gmail.com>
Date: Fri, 19 Apr 2024 13:57:03 -0700
From: James Houghton <jthoughton@...gle.com>
To: David Matlack <dmatlack@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Paolo Bonzini <pbonzini@...hat.com>,
Yu Zhao <yuzhao@...gle.com>, Marc Zyngier <maz@...nel.org>,
Oliver Upton <oliver.upton@...ux.dev>, Sean Christopherson <seanjc@...gle.com>,
Jonathan Corbet <corbet@....net>, James Morse <james.morse@....com>,
Suzuki K Poulose <suzuki.poulose@....com>, Zenghui Yu <yuzenghui@...wei.com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Shaoqin Huang <shahuang@...hat.com>,
Gavin Shan <gshan@...hat.com>, Ricardo Koller <ricarkol@...gle.com>,
Raghavendra Rao Ananta <rananta@...gle.com>, Ryan Roberts <ryan.roberts@....com>,
David Rientjes <rientjes@...gle.com>, Axel Rasmussen <axelrasmussen@...gle.com>,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
kvmarm@...ts.linux.dev, kvm@...r.kernel.org, linux-mm@...ck.org,
linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/7] mm/kvm: Improve parallelism for access bit harvesting
On Fri, Apr 12, 2024 at 11:41 AM David Matlack <dmatlack@...gle.com> wrote:
>
> On 2024-04-01 11:29 PM, James Houghton wrote:
> > This patchset adds a fast path in KVM to test and clear access bits on
> > sptes without taking the mmu_lock. It also adds support for using a
> > bitmap to (1) test the access bits for many sptes in a single call to
> > mmu_notifier_test_young, and to (2) clear the access bits for many ptes
> > in a single call to mmu_notifier_clear_young.
>
> How much improvement would we get if we _just_ made test/clear_young
> lockless on x86 and hold the read-lock on arm64? And then how much
> benefit does the bitmap look-around add on top of that?
I don't have these results right now. For the next version I will (1)
separate the series into the locking change and the bitmap change, and
I will (2) have performance data for each change separately. It is
conceivable that the bitmap change should just be considered as a
completely separate patchset.
Powered by blists - more mailing lists