Message-ID: <20241206205602.7phcrxqsv4c6oul4@altlinux.org>
Date: Fri, 6 Dec 2024 23:56:02 +0300
From: Vitaly Chikunov <vt@...linux.org>
To: Marc Zyngier <maz@...nel.org>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@...wei.com>,
	Will Deacon <will@...nel.org>,
	"james.morse@....com" <james.morse@....com>,
	"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
	Catalin Marinas <catalin.marinas@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"oliver.upton@...ux.dev" <oliver.upton@...ux.dev>,
	"mark.rutland@....com" <mark.rutland@....com>,
	"Wangzhou (B)" <wangzhou1@...ilicon.com>,
	Gleb Fotengauer-Malinovskiy <glebfm@...linux.org>
Subject: Re: v6.13-rc1: Internal error: Oops - Undefined instruction:
 0000000002000000 [#1] SMP

Marc,

On Wed, Dec 04, 2024 at 08:51:26AM +0000, Marc Zyngier wrote:
> On Tue, 03 Dec 2024 22:14:53 +0000,
> Vitaly Chikunov <vt@...linux.org> wrote:
> > 
> > Shameer, Marc, Oliver, Will,
> > 
> > On Tue, Dec 03, 2024 at 10:03:11AM +0000, Shameerali Kolothum Thodi wrote:
> > > > -----Original Message-----
> > > > From: linux-arm-kernel <linux-arm-kernel-bounces@...ts.infradead.org> On
> > > > Behalf Of Vitaly Chikunov
> > > > Sent: Tuesday, December 3, 2024 9:27 AM
> > > > To: Marc Zyngier <maz@...nel.org>
> > > > Cc: Will Deacon <will@...nel.org>; james.morse@....com; linux-arm-
> > > > kernel@...ts.infradead.org; Catalin Marinas <catalin.marinas@....com>;
> > > > linux-kernel@...r.kernel.org; oliver.upton@...ux.dev;
> > > > mark.rutland@....com
> > > > Subject: Re: v6.13-rc1: Internal error: Oops - Undefined instruction:
> > > > 0000000002000000 [#1] SMP
> > > > 
> > > > Marc,
> > > > 
> > > > On Tue, Dec 03, 2024 at 01:31:19AM +0300, Vitaly Chikunov wrote:
> > > > > On Mon, Dec 02, 2024 at 04:07:03PM +0000, Marc Zyngier wrote:
> > > > > > On Mon, 02 Dec 2024 15:59:40 +0000,
> > > > > > Vitaly Chikunov <vt@...linux.org> wrote:
> > > > > > >
> > > > > > > Marc,
> > > > > > >
> > > > > > > On Mon, Dec 02, 2024 at 03:53:59PM +0000, Marc Zyngier wrote:
> > > > > > > >
> > > > > > > > What the log doesn't say is what the host is. Is it 6.13-rc1 as well?
> > > > > > >
> > > > > > > No, host is 6.6.60.
> > > > > >
> > > > > > Right. I wouldn't be surprised if:
> > > > > >
> > > > > > - this v6.6 kernel doesn't hide the MPAM feature as it should (and
> > > > > > >   that's probably something we should backport)
> > > > >
> > > > > How to confirm this? Currently I cannot find any (case-insensitive)
> > > > > "MPAM" files in /sys, nor mpam string in /proc/cpuinfo, nor MPAM
> > > > > strings in `strace -v` (as it decodes some KVM ioctls) of qemu process.
> > > > >
> > > > > >
> > > > > > - you get a nastygram in the host log telling you that the guest has
> > > > > >   executed something it shouldn't (you'll get the encoding of the
> > > > > >   instruction)
> > > > >
> > > > > > I asked the admins of the box for the dmesg output, since I don't have
> > > > > > root access myself and nowadays dmesg is not accessible to a regular user.
> > > > 
> > > > This is what they reported:
> > > > 
> > > > >   kvm [2502822]: Unsupported guest sys_reg access at: ffff80008003e9f0 [000000c5]
> > > >                    { Op0( 3), Op1( 0), CRn(10), CRm( 4), Op2( 4), func_read },
> > > > 
> > > 
> > > As Will pointed out I think this is access to MPAMIDR_EL1 and is from this
> > > code here,
> > > 
> > > +++ b/arch/arm64/kernel/cpuinfo.c
> > > @@ -478,6 +478,9 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
> > >  	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
> > >  		__cpuinfo_store_cpu_32bit(&info->aarch32);
> > >  
> > > +	if (id_aa64pfr0_mpam(info->reg_id_aa64pfr0))
> > > +		info->reg_mpamidr = read_cpuid(MPAMIDR_EL1);
> > > +
> > >  	cpuinfo_detect_icache_policy(info);
> > >  }
> > > 
> > > I did manage to boot my setup in 6.6 and this is what happens,
> > > 
> > > Host kernel 6.6
> > > Guest Kernel 6.13-rc1
> > > 
> > > [    0.195392] smp: Brought up 1 node, 8 CPUs
> > > [    0.219000] SMP: Total of 8 processors activated.
> > > [    0.219629] CPU: All CPU(s) started at EL1
> > > ...
> > > [    0.223212] CPU features: detected: RAS Extension Support
> > > [    0.223927] CPU features: detected: Memory Partitioning And Monitoring
> > > [    0.224796] CPU features: detected: Memory Partitioning And Monitoring Virtualisation
> > > [    0.225961] alternatives: applying system-wide alternatives
> > > ...
> > > 
> > > Guest detects MPAM and boots fine.
> > > 
> > > Host kernel 6.13-rc1
> > > Guest Kernel 6.13-rc1
> > > 
> > > [    0.196625] smp: Brought up 1 node, 8 CPUs
> > > [    0.222093] SMP: Total of 8 processors activated.
> > > [    0.222769] CPU: All CPU(s) started at EL1
> > > ...
> > > [    0.226620] CPU features: detected: RAS Extension Support
> > > [    0.227453] alternatives: applying system-wide alternatives
> > > 
> > > MPAM is not visible to Guest in this case.
> > > 
> > > So as I pointed out earlier could it be a case where the ID register reports MPAM support
> > > but the firmware has not enabled MPAM?
> > > 
> > > James seems to be mentioning that case here,
> > > 
> > > " (If you have a boot failure that bisects here its likely your CPUs
> > > advertise MPAM in the id registers, but firmware failed to either enable
> > > or MPAM, or emulate the trap as if it were disabled)"
> > 
> > I tried to verify that MPAM is advertised with qemu+gdb method, as
> > suggested by Oliver, but ID_AA64PFR0_EL1 register is not there.
> > 
> >   (gdb) i r ID_AA64PFR0_EL1
> >   Invalid register `ID_AA64PFR0_EL1'
> 
> Then there is a bug in either QEMU or the GDB stubs. This register
> exists, or you wouldn't be here.
> 
> > 
> > Are there other suggestions?
> 
> Mark has described what the problem is likely to be. 6.6-stable needs
> to have 6685f5d572c22e10 backported, and it probably should have been
> Cc: to stable. Can you please apply the following patch to your *host*
> machine and retest?

We tested the host with this patch applied on top of 6.6.63, and the
6.13-rc1 guest no longer Oopses.
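
To spell out why this helps, here is a minimal user-space sketch of the two
masking steps in the patch. It is not kernel code. It assumes the MPAM field
sits in bits [43:40] of ID_AA64PFR0_EL1, and the helper names
guest_visible_pfr0() and sanitise_user_pfr0() are made up for illustration.

```c
#include <stdint.h>

/* Assumption: ID_AA64PFR0_EL1.MPAM occupies bits [43:40]. */
#define MPAM_SHIFT 40
#define MPAM_MASK  (UINT64_C(0xf) << MPAM_SHIFT)

/*
 * Hypothetical stand-in for the read side of the patch: whatever the
 * hardware reports, the guest sees MPAM == 0 and so never tries to
 * read MPAMIDR_EL1.
 */
static uint64_t guest_visible_pfr0(uint64_t hw_val)
{
	return hw_val & ~MPAM_MASK;
}

/*
 * Hypothetical stand-in for the set_user side: if user space writes back
 * the same MPAM value the hardware has (as saved by an older host), the
 * write is tolerated but the field is cleared. Any other MPAM value is
 * passed through unchanged for the regular ID register checks to judge.
 */
static uint64_t sanitise_user_pfr0(uint64_t hw_val, uint64_t user_val)
{
	if ((hw_val & MPAM_MASK) == (user_val & MPAM_MASK))
		user_val &= ~MPAM_MASK;

	return user_val;
}
```

With a made-up hardware value of 0x0000010000000011 (MPAM = 1), both
guest_visible_pfr0(hw) and sanitise_user_pfr0(hw, hw) give 0x11, so the
guest has no reason to touch the MPAM registers.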

I'd suggest this also be backported to the 6.12.y branch.

Thanks,

> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 370a1a7bd369..258a39bcd3c7 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1330,6 +1330,7 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
>  			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE);
>  
>  		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME);
> +		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MPAM_frac);
>  		break;
>  	case SYS_ID_AA64ISAR1_EL1:
>  		if (!vcpu_has_ptrauth(vcpu))
> @@ -1472,6 +1473,13 @@ static u64 read_sanitised_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>  
>  	val &= ~ID_AA64PFR0_EL1_AMU_MASK;
>  
> +	/*
> +	 * MPAM is disabled by default as KVM also needs a set of PARTID to
> +	 * program the MPAMVPMx_EL2 PARTID remapping registers with. But some
> +	 * older kernels let the guest see the ID bit.
> +	 */
> +	val &= ~ID_AA64PFR0_EL1_MPAM_MASK;
> +
>  	return val;
>  }
>  
> @@ -1560,6 +1568,29 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  	return set_id_reg(vcpu, rd, val);
>  }
>  
> +static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> +			       const struct sys_reg_desc *rd, u64 user_val)
> +{
> +	u64 hw_val = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +	u64 mpam_mask = ID_AA64PFR0_EL1_MPAM_MASK;
> +
> +	/*
> +	 * Commit 011e5f5bf529f ("arm64/cpufeature: Add remaining feature bits
> +	 * in ID_AA64PFR0 register") exposed the MPAM field of AA64PFR0_EL1 to
> +	 * guests, but didn't add trap handling. KVM doesn't support MPAM and
> +	 * always returns an UNDEF for these registers. The guest must see 0
> +	 * for this field.
> +	 *
> +	 * But KVM must also accept values from user-space that were provided
> +	 * by KVM. On CPUs that support MPAM, permit user-space to write
> +	 * the sanitizied value to ID_AA64PFR0_EL1.MPAM, but ignore this field.
> +	 */
> +	if ((hw_val & mpam_mask) == (user_val & mpam_mask))
> +		user_val &= ~ID_AA64PFR0_EL1_MPAM_MASK;
> +
> +	return set_id_reg(vcpu, rd, user_val);
> +}
> +
>  /*
>   * cpufeature ID register user accessors
>   *
> @@ -2018,7 +2049,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	{ SYS_DESC(SYS_ID_AA64PFR0_EL1),
>  	  .access = access_id_reg,
>  	  .get_user = get_id_reg,
> -	  .set_user = set_id_reg,
> +	  .set_user = set_id_aa64pfr0_el1,
>  	  .reset = read_sanitised_id_aa64pfr0_el1,
>  	  .val = ID_AA64PFR0_EL1_CSV2_MASK | ID_AA64PFR0_EL1_CSV3_MASK, },
>  	ID_SANITISED(ID_AA64PFR1_EL1),
> 
> > > https://lore.kernel.org/all/20241030160317.2528209-4-joey.gouly@arm.com/
> > > 
> > > Is there a way you can find out the BIOS version on that board?
> > 
> > Unfortunately, admins of the server do not provide me with this
> > info.
> 
> This doesn't really help, I'm afraid.
> 
> > For such cases, when MPAM is incorrectly advertised, can we have a kernel
> > command line parameter like mpam=0 to override its detection?
> 
> We could, but only when we can confirm what the problem is.
> 
> > I think with "If you have a boot failure that bisects here" it's
> > an acknowledged possibility, and it's confirmed by our server.
> 
> Not really. This talks about firmware. We are debugging the hypervisor
> here. This might be closely related, but these are not the same
> things.
> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
