[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231012154439.GM3952@nvidia.com>
Date: Thu, 12 Oct 2023 12:44:39 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Will Deacon <will@...nel.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>, ankita@...dia.com,
maz@...nel.org, oliver.upton@...ux.dev, aniketa@...dia.com,
cjia@...dia.com, kwankhede@...dia.com, targupta@...dia.com,
vsethi@...dia.com, acurrid@...dia.com, apopple@...dia.com,
jhubbard@...dia.com, danw@...dia.com,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and
NORMAL_NC for IO memory
On Thu, Oct 12, 2023 at 03:48:08PM +0100, Will Deacon wrote:
> I guess my wider point is that I'm not convinced that non-cacheable is
> actually much better and I think we're going way off the deep end looking
> at what particular implementations do and trying to justify to ourselves
> that non-cacheable is safe, even though it's still a normal memory type
> at the end of the day.
When we went over this with ARM it became fairly clear there wasn't an
official statement that Device-* is safe from uncontained
failures. For instance, looking at the actual IP, our architects
pointed out that ARM IP already provides ways for Device-* to trigger
uncontained failures today.
We then mutually concluded that KVM safe implementations must already
be preventing uncontained failures for Device-* at the system level
and that same prevention will carry over to NormalNC as well.
IMHO, this seems to be a gap where ARM has not fully defined when
uncontained failures are allowed and left that as an implementation
choice.
In other words, KVM safety around uncontained failure is not a
property that can be reasoned about from the ARM architecture alone.
> The current wording talks about use-cases (I get this) and error containment
> (it's a property of the system) but doesn't talk at all about why Normal-NC
> is the right result.
Given that Device-* and NormalNC are equally implementation defined
with regards to uncontained failures, NormalNC allows more VM
functionality.
Further, we have a broad agreement that this use case is important,
and that NormalNC is the correct way to adress it.
I think you are right to ask for more formality from ARM team but also
we shouldn't hold up fixing real functional bugs in real shipping
server ARM products.
Jason
Powered by blists - more mailing lists