Message-ID: <2b8e3ca5-1645-489c-9d7f-dd13e5fc43ed@kzalloc.com>
Date: Thu, 9 Oct 2025 08:10:53 +0900
From: Yunseong Kim <ysk@...lloc.com>
To: Catalin Marinas <catalin.marinas@....com>,
 James Morse <james.morse@....com>, Will Deacon <will@...nel.org>,
 Yeoreum Yun <yeoreum.yun@....com>,
 Vincenzo Frascino <vincenzo.frascino@....com>
Cc: Andrey Konovalov <andreyknvl@...il.com>, Marc Zyngier <maz@...nel.org>,
 Mark Brown <broonie@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>,
 Ard Biesheuvel <ardb@...nel.org>, Andrey Ryabinin <ryabinin.a.a@...il.com>,
 Alexander Potapenko <glider@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 kasan-dev@...glegroups.com
Subject: Re: [PATCH] arm64: cpufeature: Don't cpu_enable_mte() when
 KASAN_GENERIC is active

To summarize my situation: I suspected that the boot panic was caused by
an incompatibility between MTE and Generic KASAN, so I sent this patch.

However, the problem now appears to be in the call path that handles the
zero page. I am also curious why this works correctly on other machines.

On 10/9/25 7:28 AM, Yunseong Kim wrote:
> Hi Andrey,
> 
> On 10/9/25 6:36 AM, Andrey Konovalov wrote:
>> On Wed, Oct 8, 2025 at 11:13 PM Yunseong Kim <ysk@...lloc.com> wrote:
>>> [...]
>> I do not understand this. Why is Generic KASAN incompatible with MTE?
> 
> My board wouldn't boot on the debian debug kernel, so I enabled
> earlycon=pl011,0x40d0000 and checked via the UART console.
> 
>> Running Generic KASAN in the kernel while having MTE enabled (and e.g.
>> used in userspace) seems like a valid combination.
> 
> Then it must be caused by something else. Thank you for letting me know.
> 
> It seems to be occurring in the call path as follows:
> 
> cpu_enable_mte()
>  -> try_page_mte_tagging(ZERO_PAGE(0))
>    -> VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
> 
>  https://elixir.bootlin.com/linux/v6.17/source/arch/arm64/include/asm/mte.h#L83

 -> page_folio(ZERO_PAGE(0))
  -> (struct folio *)_compound_head(ZERO_PAGE(0))

 https://elixir.bootlin.com/linux/v6.17/source/include/linux/page-flags.h#L307
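
For illustration, the head lookup at the end of this path can be modelled
in userspace C roughly as below. This is a simplified sketch, not the
kernel source: model_compound_head() and its arguments are hypothetical
names, the fake-head handling of the real helper is omitted, and the
all-ones compound_head word is only an assumption, chosen because it would
yield the 0xfffffffffffffffe seen in x20 in the oops quoted below.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the tail-page check in _compound_head().
 * page_addr and compound_head_word stand in for the struct page
 * pointer and its compound_head field (hypothetical parameters).
 */
static uint64_t model_compound_head(uint64_t page_addr,
				    uint64_t compound_head_word)
{
	/* Bit 0 set marks a tail page; the remaining bits are the head. */
	if (compound_head_word & 1)
		return compound_head_word - 1;

	/* Otherwise the page is treated as its own head. */
	return page_addr;
}

/* ASSUMPTION: the field holds all ones (e.g. uninitialized memmap). */
static void check_all_ones_head(void)
{
	uint64_t folio = model_compound_head(0x1000, UINT64_MAX);

	/* An all-ones word looks like a tail page with a bogus head. */
	assert(folio == 0xfffffffffffffffeULL);
}
```

If that assumption holds, page_folio() would return a bogus folio pointer
and any later field access through it would fault.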

>> The crash log above looks like a NULL-ptr-deref. On which line of code
>> does it happen?
> 
> Decoded stack trace here:
> 
> [    0.000000] Unable to handle kernel paging request at virtual address dfff800000000005
> [    0.000000] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
> [    0.000000] Mem abort info:
> [    0.000000]   ESR = 0x0000000096000005
> [    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.000000]   SET = 0, FnV = 0
> [    0.000000]   EA = 0, S1PTW = 0
> [    0.000000]   FSC = 0x05: level 1 translation fault
> [    0.000000] Data abort info:
> [    0.000000]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> [    0.000000]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [    0.000000]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [    0.000000] [dfff800000000005] address between user and kernel address ranges
> [    0.000000] Internal error: Oops: 0000000096000005 [#1]  SMP
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.17+unreleased-debug-arm64 #1 PREEMPTLAZY  Debian 6.17-1~exp1
> [    0.000000] pstate: 800000c9 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.000000] pc : cpu_enable_mte (debian/build/build_arm64_none_debug-arm64/include/linux/page-flags.h:1065 (discriminator 1) debian/build/build_arm64_none_debug-arm64/arch/arm64/include/asm/mte.h:83 (discriminator 1) debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/cpufeature.c:2419 (discriminator 1))
> [    0.000000] lr : cpu_enable_mte (debian/build/build_arm64_none_debug-arm64/include/linux/page-flags.h:1065 (discriminator 1) debian/build/build_arm64_none_debug-arm64/arch/arm64/include/asm/mte.h:83 (discriminator 1) debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/cpufeature.c:2419 (discriminator 1))
> [    0.000000] sp : ffff800084f67d80
> [    0.000000] x29: ffff800084f67d80 x28: 0000000000000043 x27: 0000000000000001
> [    0.000000] x26: 0000000000000001 x25: ffff800084204008 x24: ffff800084203da8
> [    0.000000] x23: ffff800084204000 x22: ffff800084203000 x21: ffff8000865a8000
> [    0.000000] x20: fffffffffffffffe x19: fffffdffddaa6a00 x18: 0000000000000011
> [    0.000000] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [    0.000000] x14: 0000000000000000 x13: 0000000000000001 x12: ffff700010a04829
> [    0.000000] x11: 1ffff00010a04828 x10: ffff700010a04828 x9 : dfff800000000000
> [    0.000000] x8 : ffff800085024143 x7 : 0000000000000001 x6 : ffff700010a04828
> [    0.000000] x5 : ffff800084f9d200 x4 : 0000000000000000 x3 : ffff8000800794ac
> [    0.000000] x2 : 0000000000000005 x1 : dfff800000000000 x0 : 000000000000002e
> [    0.000000] Call trace:
> [    0.000000]  cpu_enable_mte (debian/build/build_arm64_none_debug-arm64/include/linux/page-flags.h:1065 (discriminator 1) debian/build/build_arm64_none_debug-arm64/arch/arm64/include/asm/mte.h:83 (discriminator 1) debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/cpufeature.c:2419 (discriminator 1)) (P)
> [    0.000000]  enable_cpu_capabilities (debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/cpufeature.c:3561 (discriminator 2))
> [    0.000000]  setup_boot_cpu_features (debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/cpufeature.c:3888 debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/cpufeature.c:3906)
> [    0.000000]  smp_prepare_boot_cpu (debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/smp.c:466)
> [    0.000000]  start_kernel (debian/build/build_arm64_none_debug-arm64/init/main.c:929)
> [    0.000000]  __primary_switched (debian/build/build_arm64_none_debug-arm64/arch/arm64/kernel/head.S:247)
> [    0.000000] Code: 9100c280 d2d00001 f2fbffe1 d343fc02 (38e16841)
> All code
> ========
>    0:	9100c280 	add	x0, x20, #0x30
>    4:	d2d00001 	mov	x1, #0x800000000000        	// #140737488355328
>    8:	f2fbffe1 	movk	x1, #0xdfff, lsl #48
>    c:	d343fc02 	lsr	x2, x0, #3
>   10:*	38e16841 	ldrsb	w1, [x2, x1]		<-- trapping instruction
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	38e16841 	ldrsb	w1, [x2, x1]
> [    0.000000] ---[ end trace 0000000000000000 ]---
> [    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> [    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
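
The faulting address can be cross-checked against the registers with a
little arithmetic. The sketch below reproduces, in userspace C, what the
four instructions before the trapping load compute: x0 = x20 + 0x30, then
a shadow byte lookup at (x0 >> 3) + x1. The macro and function names are
my own and only illustrative; the constants are copied from the
disassembly and register dump above.

```c
#include <assert.h>
#include <stdint.h>

/* Value built in x1 by the mov/movk pair in the disassembly. */
#define SHADOW_OFFSET		0xdfff800000000000ULL
/* lsr x2, x0, #3: one shadow byte covers 8 bytes of memory. */
#define SHADOW_SCALE_SHIFT	3

/* Address of the shadow byte that the instrumented load checks. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Recompute x0, x2 and the faulting address from x20 alone. */
static void check_against_oops(void)
{
	uint64_t x20 = 0xfffffffffffffffeULL;
	uint64_t x0 = x20 + 0x30;	/* wraps modulo 2^64 */

	assert(x0 == 0x2e);				/* x0 in the dump */
	assert((x0 >> SHADOW_SCALE_SHIFT) == 0x5);	/* x2 in the dump */
	assert(mem_to_shadow(x0) == 0xdfff800000000005ULL);
}
```

The result matches the reported fault at dfff800000000005, and the shadow
byte at index 5 covers 0x28-0x2f, which is the range in the KASAN line. So
the access being checked is at offset 0x30 from the value in x20, not from
a true NULL pointer.
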
> 
> 
> If there are any other points you'd like me to check or directions, please
> let me know.
> 
> Thank you!
> 
> Yunseong

Best regards,
Yunseong

