Message-ID: <CA+fCnZcknrhCOskgLLcTn_-o5jSiQsFni7ihMWuc1Qsd-Pu7gg@mail.gmail.com>
Date: Wed, 8 Oct 2025 23:36:16 +0200
From: Andrey Konovalov <andreyknvl@...il.com>
To: Yunseong Kim <ysk@...lloc.com>
Cc: Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>, 
	James Morse <james.morse@....com>, Yeoreum Yun <yeoreum.yun@....com>, 
	Vincenzo Frascino <vincenzo.frascino@....com>, Marc Zyngier <maz@...nel.org>, 
	Mark Brown <broonie@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>, 
	Ard Biesheuvel <ardb@...nel.org>, Andrey Ryabinin <ryabinin.a.a@...il.com>, 
	Alexander Potapenko <glider@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>, 
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, 
	kasan-dev@...glegroups.com
Subject: Re: [PATCH] arm64: cpufeature: Don't cpu_enable_mte() when
 KASAN_GENERIC is active

On Wed, Oct 8, 2025 at 11:13 PM Yunseong Kim <ysk@...lloc.com> wrote:
>
> When a kernel built with CONFIG_KASAN_GENERIC=y is booted on MTE-capable
> hardware, a kernel panic occurs early in the boot process. The crash
> happens when the CPU feature detection logic attempts to enable the Memory
> Tagging Extension (MTE) via cpu_enable_mte().
>
> Because the kernel is instrumented by the software-only Generic KASAN,
> the code within cpu_enable_mte() itself is instrumented. This leads to
> a fatal memory access fault within KASAN's shadow memory region when
> the MTE initialization is attempted. Currently, the only workaround is
> to boot with the "arm64.nomte" kernel parameter.
>
> This bug was discovered during work on supporting the Debian debug kernel
> on the Arm v9.2 RADXA Orion O6 board:
>
>  https://salsa.debian.org/kernel-team/linux/-/merge_requests/1670
>
> Related kernel configs:
>
>  CONFIG_ARM64_AS_HAS_MTE=y
>  CONFIG_ARM64_MTE=y
>
>  CONFIG_KASAN_SHADOW_OFFSET=0xdfff800000000000
>  CONFIG_HAVE_ARCH_KASAN=y
>  CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y
>  CONFIG_HAVE_ARCH_KASAN_HW_TAGS=y
>  CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
>  CONFIG_CC_HAS_KASAN_GENERIC=y
>  CONFIG_CC_HAS_KASAN_SW_TAGS=y
>
>  CONFIG_KASAN=y
>  CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
>  CONFIG_KASAN_GENERIC=y
>
> The panic log clearly shows the conflict:
>
> [    0.000000] kasan: KernelAddressSanitizer initialized (generic)
> [    0.000000] psci: probing for conduit method from ACPI.
> [    0.000000] psci: PSCIv1.1 detected in firmware.
> [    0.000000] psci: Using standard PSCI v0.2 function IDs
> [    0.000000] psci: Trusted OS migration not required
> [    0.000000] psci: SMC Calling Convention v1.2
> [    0.000000] percpu: Embedded 486 pages/cpu s1950104 r8192 d32360 u1990656
> [    0.000000] pcpu-alloc: s1950104 r8192 d32360 u1990656 alloc=486*4096
> [    0.000000] pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 [0] 06 [0] 07
> [    0.000000] pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11
> [    0.000000] Detected PIPT I-cache on CPU0
> [    0.000000] CPU features: detected: Address authentication (architected QARMA3 algorithm)
> [    0.000000] CPU features: detected: GICv3 CPU interface
> [    0.000000] CPU features: detected: HCRX_EL2 register
> [    0.000000] CPU features: detected: Virtualization Host Extensions
> [    0.000000] CPU features: detected: Memory Tagging Extension
> [    0.000000] CPU features: detected: Asymmetric MTE Tag Check Fault
> [    0.000000] CPU features: detected: Spectre-v4
> [    0.000000] CPU features: detected: Spectre-BHB
> [    0.000000] CPU features: detected: SSBS not fully self-synchronizing
> [    0.000000] Unable to handle kernel paging request at virtual address dfff800000000005
> [    0.000000] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
> [    0.000000] Mem abort info:
> [    0.000000]   ESR = 0x0000000096000005
> [    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.000000]   SET = 0, FnV = 0
> [    0.000000]   EA = 0, S1PTW = 0
> [    0.000000]   FSC = 0x05: level 1 translation fault
> [    0.000000] Data abort info:
> [    0.000000]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> [    0.000000]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [    0.000000]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [    0.000000] [dfff800000000005] address between user and kernel address ranges
> [    0.000000] Internal error: Oops: 0000000096000005 [#1]  SMP
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.17+unreleased-debug-arm64 #1 PREEMPTLAZY  Debian 6.17-1~exp1
> [    0.000000] pstate: 800000c9 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.000000] pc : cpu_enable_mte+0x104/0x440
> [    0.000000] lr : cpu_enable_mte+0xf4/0x440
> [    0.000000] sp : ffff800084f67d80
> [    0.000000] x29: ffff800084f67d80 x28: 0000000000000043 x27: 0000000000000001
> [    0.000000] x26: 0000000000000001 x25: ffff800084204008 x24: ffff800084203da8
> [    0.000000] x23: ffff800084204000 x22: ffff800084203000 x21: ffff8000865a8000
> [    0.000000] x20: fffffffffffffffe x19: fffffdffddaa6a00 x18: 0000000000000011
> [    0.000000] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [    0.000000] x14: 0000000000000000 x13: 0000000000000001 x12: ffff700010a04829
> [    0.000000] x11: 1ffff00010a04828 x10: ffff700010a04828 x9 : dfff800000000000
> [    0.000000] x8 : ffff800085024143 x7 : 0000000000000001 x6 : ffff700010a04828
> [    0.000000] x5 : ffff800084f9d200 x4 : 0000000000000000 x3 : ffff8000800794ac
> [    0.000000] x2 : 0000000000000005 x1 : dfff800000000000 x0 : 000000000000002e
> [    0.000000] Call trace:
> [    0.000000]  cpu_enable_mte+0x104/0x440 (P)
> [    0.000000]  enable_cpu_capabilities+0x188/0x208
> [    0.000000]  setup_boot_cpu_features+0x44/0x60
> [    0.000000]  smp_prepare_boot_cpu+0x9c/0xb8
> [    0.000000]  start_kernel+0xc8/0x528
> [    0.000000]  __primary_switched+0x8c/0xa0
> [    0.000000] Code: 9100c280 d2d00001 f2fbffe1 d343fc02 (38e16841)
> [    0.000000] ---[ end trace 0000000000000000 ]---
> [    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> [    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>
> Signed-off-by: Yunseong Kim <ysk@...lloc.com>
> ---
>  arch/arm64/kernel/cpufeature.c | 26 ++++++++++++++++++++++----
>  1 file changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 5ed401ff79e3..a0a9fa1b376d 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2340,6 +2340,24 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
>
>         kasan_init_hw_tags_cpu();
>  }
> +
> +static bool has_usable_mte(const struct arm64_cpu_capabilities *entry, int scope)
> +{
> +       if (!has_cpuid_feature(entry, scope))
> +               return false;
> +
> +       /*
> +        * MTE and Generic KASAN are mutually exclusive. Generic KASAN is a
> +        * software-only mode that is incompatible with the MTE hardware.
> +        * Do not enable MTE if Generic KASAN is active.

I do not understand this. Why is Generic KASAN incompatible with MTE?
Running Generic KASAN in the kernel while having MTE enabled (and e.g.
used in userspace) seems like a valid combination.

The crash log above looks like a NULL-ptr-deref. On which line of code
does it happen?
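
For reference, the numbers in the oops are at least consistent with a Generic
KASAN shadow check on a near-NULL pointer. Below is a minimal userspace sketch
of the Generic KASAN address-to-shadow mapping, assuming the usual scale shift
of 3 (one shadow byte per 8 bytes of memory) and the KASAN_SHADOW_OFFSET from
the quoted config. The demo_* helper names are made up for illustration and
are not kernel functions, and reading x0 as the checked pointer is my
interpretation of the register dump, not something the log states.

```c
#include <assert.h>
#include <stdint.h>

/* From the quoted config: CONFIG_KASAN_SHADOW_OFFSET=0xdfff800000000000 */
#define DEMO_KASAN_SHADOW_OFFSET      0xdfff800000000000ULL
/* Assumption: Generic KASAN maps 8 bytes of memory to 1 shadow byte. */
#define DEMO_KASAN_SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: shadow byte address for a given memory address. */
static uint64_t demo_mem_to_shadow(uint64_t addr)
{
	return (addr >> DEMO_KASAN_SHADOW_SCALE_SHIFT) + DEMO_KASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: first memory address covered by a shadow byte. */
static uint64_t demo_shadow_to_mem(uint64_t shadow)
{
	return (shadow - DEMO_KASAN_SHADOW_OFFSET) << DEMO_KASAN_SHADOW_SCALE_SHIFT;
}

/* Cross-check the values printed in the quoted oops against the mapping. */
static void demo_check_oops_values(void)
{
	const uint64_t fault_addr = 0xdfff800000000005ULL; /* faulting address */
	const uint64_t x0 = 0x2eULL;                       /* x0 in the dump   */

	/* The faulting shadow byte covers 0x28..0x2f, as KASAN reported. */
	assert(demo_shadow_to_mem(fault_addr) == 0x28ULL);
	assert(demo_shadow_to_mem(fault_addr) + 7 == 0x2fULL);

	/* If x0 is the pointer being checked, its shadow is the fault address. */
	assert(demo_mem_to_shadow(x0) == fault_addr);
}
```

If that reading is right, the fault is the shadow lookup for an access to
address 0x2e, i.e. a small offset from an invalid base pointer, rather than
anything specific to enabling MTE.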


> +        */
> +       if (IS_ENABLED(CONFIG_KASAN_GENERIC) && kasan_enabled()) {
> +               pr_warn_once("MTE capability disabled due to Generic KASAN conflict\n");
> +               return false;
> +       }
> +
> +       return true;
> +}
>  #endif /* CONFIG_ARM64_MTE */
>
>  static void user_feature_fixup(void)
> @@ -2850,7 +2868,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>                 .desc = "Memory Tagging Extension",
>                 .capability = ARM64_MTE,
>                 .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> -               .matches = has_cpuid_feature,
> +               .matches = has_usable_mte,
>                 .cpu_enable = cpu_enable_mte,
>                 ARM64_CPUID_FIELDS(ID_AA64PFR1_EL1, MTE, MTE2)
>         },
> @@ -2858,21 +2876,21 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>                 .desc = "Asymmetric MTE Tag Check Fault",
>                 .capability = ARM64_MTE_ASYMM,
>                 .type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
> -               .matches = has_cpuid_feature,
> +               .matches = has_usable_mte,
>                 ARM64_CPUID_FIELDS(ID_AA64PFR1_EL1, MTE, MTE3)
>         },
>         {
>                 .desc = "FAR on MTE Tag Check Fault",
>                 .capability = ARM64_MTE_FAR,
>                 .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> -               .matches = has_cpuid_feature,
> +               .matches = has_usable_mte,
>                 ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, MTEFAR, IMP)
>         },
>         {
>                 .desc = "Store Only MTE Tag Check",
>                 .capability = ARM64_MTE_STORE_ONLY,
>                 .type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
> -               .matches = has_cpuid_feature,
> +               .matches = has_usable_mte,
>                 ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, MTESTOREONLY, IMP)
>         },
>  #endif /* CONFIG_ARM64_MTE */
> --
> 2.51.0
>
