lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zr09cyPZNShzeZc6@linux.dev>
Date: Wed, 14 Aug 2024 16:27:47 -0700
From: Oliver Upton <oliver.upton@...ux.dev>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, Peter Xu <peterx@...hat.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Oscar Salvador <osalvador@...e.de>,
	Axel Rasmussen <axelrasmussen@...gle.com>,
	linux-arm-kernel@...ts.infradead.org, x86@...nel.org,
	Will Deacon <will@...nel.org>, Gavin Shan <gshan@...hat.com>,
	Paolo Bonzini <pbonzini@...hat.com>, Zi Yan <ziy@...dia.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Ingo Molnar <mingo@...hat.com>,
	Alistair Popple <apopple@...dia.com>,
	Borislav Petkov <bp@...en8.de>,
	David Hildenbrand <david@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>, kvm@...r.kernel.org,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Alex Williamson <alex.williamson@...hat.com>,
	Yan Zhao <yan.y.zhao@...el.com>, Marc Zyngier <maz@...nel.org>
Subject: Re: [PATCH 00/19] mm: Support huge pfnmaps

On Wed, Aug 14, 2024 at 01:54:04PM -0700, Sean Christopherson wrote:
> TL;DR: it's probably worth looking at mmu_stress_test (was: max_guest_memory_test)
> on arm64, specifically the mprotect() testcase[1], as performance is significantly
> worse compared to x86,

Sharing what we discussed offline:

Sean was using a machine w/o FEAT_FWB for this test, so the increased
runtime on arm64 is likely explained by the CMOs we're doing when
creating or invalidating a stage-2 PTE.

Using a machine w/ FEAT_FWB would be better for making these sort of
cross-architecture comparisons. Beyond CMOs, we do have some 

> and there might be bugs lurking the mmu_notifier flows.

Impossible! :)

> Jumping back to mmap_lock, adding a lock, vma_lookup(), and unlock in x86's page
> fault path for valid VMAs does introduce a performance regression, but only ~30%,
> not the ~6x jump from x86 to arm64.  So that too makes it unlikely taking mmap_lock
> is the main problem, though it's still good justification for avoid mmap_lock in
> the page fault path.

I'm curious how much of that 30% in a microbenchmark would translate to
real world performance, since it isn't *that* egregious. We also have
other uses for getting at the VMA beyond mapping granularity (MTE and
the VFIO Normal-NC hint) that'd require some attention too.

-- 
Thanks,
Oliver

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ