linux-kernel - Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20231013092934.GA13524@willie-the-truck>
Date:   Fri, 13 Oct 2023 10:29:35 +0100
From:   Will Deacon <will@...nel.org>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Lorenzo Pieralisi <lpieralisi@...nel.org>,
        Jason Gunthorpe <jgg@...dia.com>, ankita@...dia.com,
        maz@...nel.org, oliver.upton@...ux.dev, aniketa@...dia.com,
        cjia@...dia.com, kwankhede@...dia.com, targupta@...dia.com,
        vsethi@...dia.com, acurrid@...dia.com, apopple@...dia.com,
        jhubbard@...dia.com, danw@...dia.com,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and
 NORMAL_NC for IO memory

On Thu, Oct 12, 2023 at 06:26:01PM +0100, Catalin Marinas wrote:
> On Thu, Oct 12, 2023 at 03:48:08PM +0100, Will Deacon wrote:
> > Claiming back the device also seems strange if the guest has been using
> > non-cacheable accesses since I think you could get write merging and
> > reordering with subsequent device accesses trying to reset the device.
> 
> True. Not sure we have a good story here (maybe reinvent the DWB barrier ;)).

We do have a good story for this part: use Device-nGnRE!

> > > So, for now I'd only relax this if we know there's RAM(-like) on the
> > > other side and won't trigger some potentially uncontainable errors as a
> > > result.
> > 
> > I guess my wider point is that I'm not convinced that non-cacheable is
> > actually much better and I think we're going way off the deep end looking
> > at what particular implementations do and trying to justify to ourselves
> > that non-cacheable is safe, even though it's still a normal memory type
> > at the end of the day.
> 
> Is this about Device vs NC or Device/NC vs Normal Cacheable? The
> justification for the former has been summarised in Lorenzo's write-up.
> How the hardware behaves, it depends a lot on the RAS implementation.
> The BSA has some statements but not sure it covers everything.

I don't know how to be more clear, but I'm asking why Normal-NC is the right
memory type to use. Jason's explanation on the other thread about how it's
basically the only option with FWB (with some hand-waving about why
Normal-cacheable isn't safe) will have to do, but it needs to be in the
commit message.

The host maps MMIO with Device-nGnRE. Allowing a guest to map it as Normal
surely means the host is going to need additional logic to deal with that?
We briefly discussed claiming back a device above and I'm not convinced
that the code we have for doing that today will work correctly if the
guest has issued a bunch of Normal-NC stores prior to the device being
unmapped.

Could we change these patches so that the memory type of the stage-1 VMA
in the VMM is reflected in the stage-2? In other words, continue to use
Device mappings at stage-2 for I/O but relax to Normal-NC if that's
how the VMM has it mapped?

> Things can go wrong but that's not because Device does anything better.
> Given the RAS implementation, external aborts caused on Device memory
> (e.g. wrong size access) is uncontainable. For Normal NC it can be
> contained (I can dig out the reasoning behind this if you want, IIUC
> something to do with not being able to cancel an already issued Device
> access since such accesses don't allow speculation due to side-effects;
> for Normal NC, it's just about the software not getting the data).

I really think these details belong in the commit message.

> > Obviously, it's up to Marc and Oliver if they want to do this, but I'm
> > wary without an official statement from Arm to say that Normal-NC is
> > correct. There's mention of such a statement in the cover letter:
> > 
> >   > We hope ARM will publish information helping platform designers
> >   > follow these guidelines.
> > 
> > but imo we shouldn't merge this without either:
> > 
> >   (a) _Architectural_ guidance (as opposed to some random whitepaper or
> >       half-baked certification scheme).
> 
> Well, you know the story, the architects will probably make it a SoC or
> integration issue, PCIe etc., not something that can live in the Arm
> ARM. The best we could get is more recommendations in the RAS spec
> around containment but not for things that might happen outside the CPU,
> e.g. PCIe root complex.

The Arm ARM _does_ mention PCI config space when talking about early write
acknowledgement, so there's some precedence for providing guidance around
which memory types to use.

> > - or -
> > 
> >   (b) A concrete justification based on the current architecture as to
> >       why Normal-NC is the right thing to do for KVM.
> 
> To put it differently, we don't have any strong arguments why Device is
> the right thing to do. We chose Device based on some understanding
> software people had about how the hardware behaves, which apparently
> wasn't entirely correct (and summarised by Lorenzo).

I think we use Device because that's what the host uses in its stage-1
and mismatched aliases are bad.

Will