Message-ID: <20250715115200.GJ2067380@nvidia.com>
Date: Tue, 15 Jul 2025 08:52:00 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Will Deacon <will@...nel.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
	Alexander Gordeev <agordeev@...ux.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Christian Borntraeger <borntraeger@...ux.ibm.com>,
	Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
	Vasily Gorbik <gor@...ux.ibm.com>,
	Heiko Carstens <hca@...ux.ibm.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Justin Stitt <justinstitt@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Leon Romanovsky <leon@...nel.org>,
	linux-rdma@...r.kernel.org, linux-s390@...r.kernel.org,
	llvm@...ts.linux.dev, Ingo Molnar <mingo@...hat.com>,
	Bill Wendling <morbo@...gle.com>,
	Nathan Chancellor <nathan@...nel.org>,
	Nick Desaulniers <ndesaulniers@...gle.com>, netdev@...r.kernel.org,
	Paolo Abeni <pabeni@...hat.com>,
	Salil Mehta <salil.mehta@...wei.com>,
	Sven Schnelle <svens@...ux.ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
	Yisen Zhuang <yisen.zhuang@...wei.com>,
	Arnd Bergmann <arnd@...db.de>,
	Leon Romanovsky <leonro@...lanox.com>, linux-arch@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org,
	Mark Rutland <mark.rutland@....com>,
	Michael Guralnik <michaelgur@...lanox.com>, patches@...ts.linux.dev,
	Niklas Schnelle <schnelle@...ux.ibm.com>,
	Jijie Shao <shaojijie@...wei.com>
Subject: Re: [PATCH v3 6/6] IB/mlx5: Use __iowrite64_copy() for write
 combining stores

On Tue, Jul 15, 2025 at 11:15:25AM +0100, Will Deacon wrote:
> > Since STP was already rejected we've only tested the Neon version. It
> > does make a huge improvement, but on rare occasions it still somehow
> > fails to combine. The CPU is really bad at this :(
> 
> I think the thread was from last year so I've forgotten most of the
> details, but wasn't STP rejected because it wasn't virtualisable? 

Yes, that was the claim.

> In which case, doesn't NEON suffer from exactly the same (or possibly
> worse) problem?

In general, yes; for mlx5 specifically, no.

mlx5 (and other RDMA devices) have long used Neon for MMIO in
userspace, so any VMM assigning mlx5 devices simply must make this
work - it is already not optional. So we know that all VMs out there
with mlx5 support Neon for mlx5, and it is safe for mlx5 to use.

Typically this is trivially done in a VMM by never emulating mlx5's
MMIO space. If the VMM takes a fault on an MMIO page it fixes the fault
and restarts the Neon instruction.

The general concern was that there could be other devices in a VM
that are fully emulated, and using these challenging instructions
would break the simple emulation. This is why the general purpose
__iowrite64_copy() didn't use STP.
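
For illustration, the kind of copy that stays compatible with simple MMIO
emulation can be sketched as a loop of plain 64-bit stores. This is a
minimal userspace sketch, not the kernel's __iowrite64_copy(): the name
sketch_iowrite64_copy() is hypothetical, and a real implementation would
go through the architecture's MMIO accessors rather than a volatile
pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a generic 64-bit MMIO copy (an assumption for
 * illustration, not the kernel's code). Each iteration is one 64-bit
 * store, with no paired (STP) or vector (Neon) stores.
 */
void sketch_iowrite64_copy(volatile uint64_t *to, const uint64_t *from,
                           size_t count)
{
    assert(to != NULL && from != NULL);

    for (size_t i = 0; i < count; i++)
        to[i] = from[i];    /* one plain 64-bit store per element */
}
```

Whether the CPU merges such stores into one 64-byte write is left to the
hardware, which is why a device-specific path may want wider stores.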

> Also, have you managed to investigate why the CPU tends not to get this
> right? 

I have asked, and our CPU architects said it is too complex to
analyze, though they admit it doesn't work entirely well :(

The belief is that some micro-architectural condition is breaking it,
as we see even Neon instructions failing during every test.

They say it will be fully fixed by ST64B in the future.

> Do we e.g. end up taking interrupts/exceptions while the self
> test is running or something like that?

I doubt it; the test runs in kernel mode during boot for
hundreds of iterations. An interrupt on every iteration is not
likely. Any single successful combine is a pass for the test.
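
The pass criterion described above (many attempts, any one success
passes) can be sketched as follows. This is only an assumed shape for
such a loop: how a successful combine is actually detected is not
described in this message, so a callback stands in for it, and every
name here (sketch_wc_self_test, sketch_fake_device, sketch_fake_attempt)
is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if this attempt was write-combined. */
typedef bool (*wc_attempt_fn)(void *ctx);

/* Pass as soon as any single attempt combines; fail only if none do. */
bool sketch_wc_self_test(wc_attempt_fn attempt, void *ctx, size_t iterations)
{
    assert(attempt != NULL);

    for (size_t i = 0; i < iterations; i++) {
        if (attempt(ctx))
            return true;
    }
    return false;
}

/* Fake device for illustration: combines only on call number succeed_on. */
struct sketch_fake_device {
    size_t calls;
    size_t succeed_on;    /* 0 means "never combines" */
};

bool sketch_fake_attempt(void *ctx)
{
    struct sketch_fake_device *dev = ctx;

    return ++dev->calls == dev->succeed_on;
}
```

With this shape, an interrupt would have to disturb every one of the
iterations to turn a working system into a test failure.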

Even an interrupt shouldn't disrupt a single-instruction Neon store,
yet we can still measure a low rate of Neon failures.

> Sorry for the wall of questions!

No worries! It's weird and definitely complicated.

Jason