Message-ID: <ZowGFl/1AEuevh96@e133380.arm.com>
Date: Mon, 8 Jul 2024 16:30:30 +0100
From: Dave Martin <Dave.Martin@....com>
To: Mark Brown <broonie@...nel.org>
Cc: Marc Zyngier <maz@...nel.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>,
	Oliver Upton <oliver.upton@...ux.dev>,
	James Morse <james.morse@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Fuad Tabba <tabba@...gle.com>, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v4 0/4] KVM: arm64: Fix underallocation of storage for
 SVE state

Hi all,

On Fri, Jul 05, 2024 at 06:18:50PM +0100, Mark Brown wrote:
> On Fri, Jul 05, 2024 at 02:20:05PM +0100, Marc Zyngier wrote:
> > Mark Brown <broonie@...nel.org> wrote:
> 
> > > As observed during review, the pKVM support for saving host SVE state
> > > is broken if an asymmetric system has VLs larger than the maximum
> > > shared VL.  Fix this by discovering and then using the maximum VL for
> > > allocations, and by using RDVL during the save/restore process.
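
(For context, the allocation side of that boils down to something like
the sketch below.  This is purely illustrative, not the actual series:
"max_vl" is a stand-in for however the cross-CPU maximum gets
discovered, and I'm using the uapi sigframe macros rather than the hyp
helpers.)

#include <linux/types.h>
#include <asm/sigcontext.h>	/* SVE_SIG_*, sve_vq_from_vl() */

/*
 * Illustrative only: size the host SVE save area for the largest VL
 * any CPU in the system implements, so an asymmetric system cannot
 * overflow it.  "max_vl" is a stand-in for however that maximum is
 * discovered.
 */
static size_t host_sve_state_size(unsigned int max_vl)
{
	return SVE_SIG_REGS_SIZE(sve_vq_from_vl(max_vl));
}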
> 
> > I really don't see why we need such complexity here.

The first patch is orthogonal cleanup, and the rest doesn't really add
complexity IIUC.

> > Fuad did post something[1] that did the trick with a far less invasive
> > change, and it really feels like we are putting the complexity at the
> > wrong place.
> 
> > So what's wrong with that approach? I get that you want to shout about
> > secondary CPUs, but that's an orthogonal problem.
> 
> As I've said, from a clarity/fragility point of view I'm not happy with
> configuring the vector length to one value and then immediately doing
> things that assume another value, even if everything is actually lined
> up in a way that works.  Having uncommented code whose correctness you
> have to go and check whenever you see it isn't great; seeing an
> inconsistency just raises alarm bells.  It is much clearer to write the
> code in a way that makes it obvious that the VL we are using is the one
> the hardware is using, and for the host save/restore reading the actual
> VL back seemed like the most straightforward way to do that.
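
(For illustration, reading the live VL back is just an RDVL.  This is a
minimal sketch under those assumptions, not the actual hyp save path,
which has toolchain and context constraints this glosses over.)

/*
 * Minimal sketch only: ask the hardware which VL is actually in effect
 * right now, rather than trusting a cached value.  RDVL returns the VL
 * scaled by the immediate, so #1 gives the VL in bytes.
 */
static inline unsigned long sve_live_vl(void)
{
	unsigned long vl;

	asm volatile(".arch_extension sve\n\t"
		     "rdvl %0, #1" : "=r" (vl));
	return vl;
}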
> 
> A similar thing applies with the enumeration code - like I said in
> reply to one of Fuad's postings, I originally wrote something that's
> basically the same as the patch Fuad posted, but because it is not
> consistent with the surrounding code in how it approaches things it
> just raised questions about whether the new code was missing something,
> or whether there was some problem that should be addressed in the
> existing code.  Rather than write an extensive changelog and/or
> comments covering these considerations, it seemed better to just write
> the code in a consistent manner so those questions aren't prompted.
> Keeping the approach consistent is a bit more code right now, but it
> makes the result much easier to reason about.
> 
> The late CPUs thing is really just an answer to the initial "why is this
> different, what might we have missed?" question rather than a particular
> goal itself.  Adding a warning is as much about documenting the handling
> of late CPUs as it is about highlighting any unfortunate implementation
> choices we run into.
> 
> Basically it's maintainability concerns, especially with the enumeration
> code.

I tend to agree here.

It's probably best to stick to one convention everywhere about how the
SVE regs are laid out for a given VL.  There's nothing wrong with Fuad's
fixed sve_ffr_offset(), but it's different from the VL-dependent offset
already used elsewhere and so risks causing confusion further down the
line.
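
(To spell out what I mean by the VL-dependent convention: the existing
helper is essentially the following, written here in terms of the uapi
sigframe macros rather than the exact kernel definition.)

#include <asm/sigcontext.h>	/* SVE_SIG_*, sve_vq_from_vl() */

/*
 * Illustrative only: with the VL-dependent convention, the FFR offset
 * within the Z/P/FFR block is a function of the VL in use.  vq is the
 * VL in units of 128 bits (16 bytes).
 */
static unsigned int ffr_offset_for_vl(unsigned int vl)
{
	unsigned int vq = sve_vq_from_vl(vl);

	return SVE_SIG_FFR_OFFSET(vq) - SVE_SIG_REGS_OFFSET;
}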


One thing confuses me:

The host can never use over-max VLs except in non-preemptible kernel
code, since code doing so would not be migratable to other physical
CPUs.  That is done only to probe SVE, and the extra bits in the vector
registers are never actually used.

Can't pKVM just hide the VLs that aren't symmetrically supported using
ZCR_EL2, just as regular KVM does for the guest?
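
Roughly what I have in mind is the sketch below - not the actual KVM
code, and cap_effective_vl() is just a name I made up here.  The only
architectural fact it relies on is that ZCR_ELx.LEN holds the VL in
quadwords minus one.

#include <asm/barrier.h>	/* isb() */
#include <asm/sysreg.h>		/* write_sysreg_s(), SYS_ZCR_EL2 */

/*
 * Rough sketch only: constrain the effective VL seen below EL2 by
 * programming ZCR_EL2.LEN.  LEN holds the VL in 16-byte quadwords
 * minus one, and the effective VL at lower ELs is capped at that
 * value.
 */
static void cap_effective_vl(unsigned int max_vl)
{
	write_sysreg_s((max_vl / 16) - 1, SYS_ZCR_EL2);
	isb();
}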

(I may be making bad assumptions about pKVM's relationship with the host
kernel.)

Cheers
---Dave
