Message-ID: <20260113161253.GG812923@nvidia.com>
Date: Tue, 13 Jan 2026 12:12:53 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Will Deacon <will@...nel.org>
Cc: Nicolin Chen <nicolinc@...dia.com>, robin.murphy@....com,
joro@...tes.org, linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
skolothumtho@...dia.com, praan@...gle.com,
xueshuai@...ux.alibaba.com, smostafa@...gle.com
Subject: Re: [PATCH rc v5 1/4] iommu/arm-smmu-v3: Add update_safe bits to fix
STE update sequence
On Tue, Jan 13, 2026 at 03:05:52PM +0000, Will Deacon wrote:
> I suppose we shouldn't ever see the case that they both have S2S, but
> that's fine.
If they both have S2S then it works correctly? Any S2S forces EATS to
follow the normal rules.
> The spec also suggests that there's an additional illegal STE case w/
> split-stage ATS (EATS_S1CHK) if Config != S1+S2.
The driver doesn't support that either.
It is fixed by checking that the new EATS is valid under the old config
and the old EATS is valid under the new config.
Also to support S1CHK someday we cannot allow the hypervisor to leave
S1_S2 and go to S2, since the HW can't deal with that...
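A minimal sketch of that cross-check, for illustration only. The enum
names, the helper names and the simplified encodings are hypothetical and
are not the driver's definitions. The only rule modelled is the one quoted
above (EATS_S1CHK is illegal unless Config is S1+S2); treating a failed
check as "cannot be updated hitlessly" is an assumption about how a caller
would use the result.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for STE.Config and STE.EATS. */
enum ste_cfg { CFG_ABORT, CFG_BYPASS, CFG_S1, CFG_S2, CFG_S1_S2 };
enum ste_eats { EATS_ABT, EATS_TRANS, EATS_S1CHK };

/* Rule quoted in the thread: split-stage ATS needs Config == S1+S2. */
static bool eats_valid_for_cfg(enum ste_eats eats, enum ste_cfg cfg)
{
	if (eats == EATS_S1CHK)
		return cfg == CFG_S1_S2;
	return true;
}

/*
 * While the STE is being rewritten the HW may observe the old config
 * with the new EATS, or the new config with the old EATS. Both mixes
 * must be legal for the change to be treated as safe.
 */
static bool eats_update_is_safe(enum ste_cfg old_cfg, enum ste_eats old_eats,
				enum ste_cfg new_cfg, enum ste_eats new_eats)
{
	return eats_valid_for_cfg(new_eats, old_cfg) &&
	       eats_valid_for_cfg(old_eats, new_cfg);
}
```

With this, going from S1_S2 with S1CHK to S2 fails the check because the
old EATS is not valid under the new config, which is the S1_S2 to S2 case
mentioned above.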
> I do wonder whether having all the hitless machinery alongside this
> "safe" stuff is really overkill and we wouldn't be better off just
> checking the cases that we actually care about rather than checking
> architecturally and then subtracting the cases we don't care about.
I'm not sure what you are thinking here. I'd argue that v4 was like
that because it was correct within the limits of the current driver
capability.
Adding more architectural checks the driver cannot hit today is nice
future-proofing. I don't mind doing it and maybe it will save someone
a lot of time down the road.
It isn't like there is some easy shortcut to sequence this someplace
else. E.g. the S1CHK stuff above is very complex in the general
case. We'd have many different versions of EATS with different configs
that can be applied in any sequence.
IMHO two spec-derived conditionals are a pretty light cost to deal with
that.
This series originated from customer bugs getting spurious STE faults
because a hitless update in the VM was not hitless in the
hypervisor. This is not just a theoretical need.
I don't want to try to shortcut things to only support a few things we
"think" should be needed and find out later it still causes VM-visible
misbehavior :(
Jason