Message-ID: <aFQaY4Bxle8-GT6O@harry>
Date: Thu, 19 Jun 2025 23:10:43 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: Uladzislau Rezki <urezki@...il.com>, oe-lkp@...ts.linux.dev, lkp@...el.com,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>, Baoquan He <bhe@...hat.com>,
        Adrian Huang <ahuang12@...ovo.com>,
        Christoph Hellwig <hch@...radead.org>,
        Mateusz Guzik <mjguzik@...il.com>, linux-mm@...ck.org,
        Suren Baghdasaryan <surenb@...gle.com>,
        Kent Overstreet <kent.overstreet@...ux.dev>
Subject: Kernel crash due to alloc_tag_top_users() being called when
 !mem_profiling_support?

On Wed, Jun 18, 2025 at 02:25:37PM +0800, kernel test robot wrote:
> 
> Hello,
> 
> for this change, we reported
> "[linux-next:master] [lib/test_vmalloc.c]  7fc85b92db: Mem-Info"
> in
> https://lore.kernel.org/all/202505071555.e757f1e0-lkp@intel.com/
> 
> at that time, we made some tests with x86_64 config which runs well.
> 
> now we noticed the commit is in mainline now.

(Re-sending due to not Ccing people and the list...)

Hi, I'm hitting the same error in my test environment.

I think this is related to memory allocation profiling & code tagging
subsystems rather than vmalloc, so let's add related folks to Cc.

After a quick skim of the code, it seems the conditions to trigger
this are that 1) MEM_ALLOC_PROFILING is compiled in, 2) it is not
enabled by default, and 3) an allocation fails, which ends up calling
alloc_tag_top_users().

I see "Memory allocation profiling is not supported!" in the dmesg,
which means alloc_tag_cttype was not allocated and initialized properly,
yet alloc_tag_top_users() still tries to acquire the semaphore.
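
As a side note, the register dump in the quoted report below appears
consistent with this. The fault address 0xdffffc000000001b is a KASAN
shadow address, and mapping it back gives 0xd8, the start of the range
KASAN reports. The sketch below redoes that arithmetic in user space. It
assumes the x86_64 shadow offset 0xdffffc0000000000 (the value in RAX in
the dump) and a shadow scale shift of 3; the helper names are made up
for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Assumed values: shadow offset as seen in RAX/R14 of the quoted oops. */
#define SHADOW_OFFSET      0xdffffc0000000000ULL
#define SHADOW_SCALE_SHIFT 3

/* Shadow byte that KASAN inspects before an access to addr. */
static uint64_t shadow_of(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* First address of the 8-byte granule covered by a shadow byte. */
static uint64_t granule_start(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}
```

With RDI/RBX = 0x70 as the semaphore pointer and the checked access at
RBX + 0x68 = 0xd8 (RBP), shadow_of(0xd8) gives the faulting address.
This looks like the address of a member of a NULL struct pointer being
passed down to down_read_trylock(), although which member that is remains
an inference from the dump.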

I think the kernel should not call alloc_tag_top_users() at all (or it
should return an error) if mem_profiling_support == false?

Does the following work on your testing environment?

(Only did very light testing on my QEMU, but seems to fix the issue for me.)

diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index d48b80f3f007..57d4d5673855 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -134,7 +134,9 @@ size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sl
 	struct codetag_bytes n;
 	unsigned int i, nr = 0;
 
-	if (can_sleep)
+	if (!mem_profiling_support)
+		return 0;
+	else if (can_sleep)
 		codetag_lock_module_list(alloc_tag_cttype, true);
 	else if (!codetag_trylock_module_list(alloc_tag_cttype))
 		return 0;
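
For readers who want to see the control flow in isolation, here is a
minimal user-space model of the guard in the diff above. Everything
prefixed with model_ is a hypothetical stand-in (the lock is a plain flag,
and the tag count is a made-up constant); it is a sketch of the early
return only, not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_TAGS 3	/* made-up number of available tags */

/* Stand-in for the codetag type; the real one holds an rw_semaphore. */
struct model_cttype {
	bool locked;
};

static bool model_profiling_support;		/* models mem_profiling_support */
static struct model_cttype *model_cttype;	/* stays NULL when unsupported */

/* Dereferences ct, so it must never be reached with ct == NULL. */
static bool model_trylock(struct model_cttype *ct)
{
	if (ct->locked)
		return false;
	ct->locked = true;
	return true;
}

/* Returns how many entries would be reported, 0 if nothing can be done. */
static size_t model_top_users(size_t count, bool can_sleep)
{
	if (!model_profiling_support)
		return 0;	/* bail out before touching model_cttype */
	else if (can_sleep)
		model_cttype->locked = true;	/* blocking lock in the real code */
	else if (!model_trylock(model_cttype))
		return 0;

	if (count > MODEL_NR_TAGS)
		count = MODEL_NR_TAGS;

	model_cttype->locked = false;
	return count;
}
```

With model_profiling_support left false and model_cttype left NULL,
model_top_users() returns 0 without dereferencing the pointer, which is
the behaviour the patch aims for on the warn_alloc() path.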

> the config still has expected diff with parent:
> 
> --- /pkg/linux/x86_64-randconfig-161-20250614/gcc-12/7a73348e5d4715b5565a53f21c01ea7b54e46cbd/.config   2025-06-17 14:40:29.481052101 +0800
> +++ /pkg/linux/x86_64-randconfig-161-20250614/gcc-12/2d76e79315e403aab595d4c8830b7a46c19f0f3b/.config   2025-06-17 14:41:18.448543738 +0800
> @@ -7551,7 +7551,7 @@ CONFIG_TEST_IDA=m
>  CONFIG_TEST_MISC_MINOR=m
>  # CONFIG_TEST_LKM is not set
>  CONFIG_TEST_BITOPS=m
> -CONFIG_TEST_VMALLOC=m
> +CONFIG_TEST_VMALLOC=y
>  # CONFIG_TEST_BPF is not set
>  CONFIG_FIND_BIT_BENCHMARK=m
>  # CONFIG_TEST_FIRMWARE is not set
> 
> 
> then we noticed similar random issue with x86_64 randconfig this time.
> 
> 7a73348e5d4715b5 2d76e79315e403aab595d4c8830
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :199         34%          67:200   dmesg.KASAN:null-ptr-deref_in_range[#-#]
>            :199         34%          67:200   dmesg.Kernel_panic-not_syncing:Fatal_exception
>            :199         34%          67:200   dmesg.Mem-Info
>            :199         34%          67:200   dmesg.Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN
>            :199         34%          67:200   dmesg.RIP:down_read_trylock
> 
> we don't have enough knowledge to understand the relationship between code
> change and the random issues. just report what we obsverved in our tests FYI.
> 
> below is full report.
> 
> 
> 
> kernel test robot noticed "Kernel_panic-not_syncing:Fatal_exception" on:
> 
> commit: 2d76e79315e403aab595d4c8830b7a46c19f0f3b ("lib/test_vmalloc.c: allow built-in execution")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> [test failed on linus/master      e04c78d86a9699d136910cfc0bdcf01087e3267e]
> [test failed on linux-next/master 050f8ad7b58d9079455af171ac279c4b9b828c11]
> 
> in testcase: boot
> 
> config: x86_64-randconfig-161-20250614
> compiler: gcc-12
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com
> 
> 
> [   36.902716][   T60] vmalloc_node_range for size 8192 failed: Address range restricted to 0xffffc90000000000 - 0xffffe8ffffffffff
> [   36.903981][   T60] vmalloc_test/0: vmalloc error: size 4096, vm_struct allocation failed, mode:0xdc0(GFP_KERNEL|__GFP_ZERO), nodemask=(null)
> [   36.905195][   T60] CPU: 1 UID: 0 PID: 60 Comm: vmalloc_test/0 Not tainted 6.15.0-rc6-00142-g2d76e79315e4 #1 VOLUNTARY 
> [   36.905201][   T60] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [   36.905203][   T60] Call Trace:
> [   36.905206][   T60]  <TASK>
> [   36.905209][   T60]  dump_stack_lvl+0x87/0xd6
> [   36.905223][   T60]  warn_alloc+0x15e/0x291
> [   36.905230][   T60]  ? has_managed_dma+0x37/0x37
> [   36.905237][   T60]  ? __get_vm_area_node+0x33a/0x3c0
> [   36.905244][   T60]  ? __get_vm_area_node+0x33a/0x3c0
> [   36.905250][   T60]  __vmalloc_node_range_noprof+0x170/0x306
> [   36.905255][   T60]  ? __vmalloc_area_node+0x460/0x460
> [   36.905260][   T60]  ? test_func+0x2ae/0x469
> [   36.905264][   T60]  __vmalloc_node_noprof+0xb8/0xd9
> [   36.905267][   T60]  ? test_func+0x2ae/0x469
> [   36.905272][   T60]  align_shift_alloc_test+0xa8/0x165
> [   36.905277][   T60]  test_func+0x2ae/0x469
> [   36.905281][   T60]  ? pcpu_alloc_test+0x31b/0x31b
> [   36.905286][   T60]  ? __kthread_parkme+0xcb/0x1a3
> [   36.905293][   T60]  ? pcpu_alloc_test+0x31b/0x31b
> [   36.905297][   T60]  kthread+0x452/0x464
> [   36.905301][   T60]  ? kthread_is_per_cpu+0x51/0x51
> [   36.905304][   T60]  ? _raw_spin_unlock_irq+0x23/0x35
> [   36.905308][   T60]  ? kthread_is_per_cpu+0x51/0x51
> [ 36.905311][ T60] ? kthread_is_per_cpu (kbuild/obj/consumer/x86_64-randconfig-161-20250614/kernel/kthread.c:413) 
> [ 36.905314][ T60] ret_from_fork (kbuild/obj/consumer/x86_64-randconfig-161-20250614/arch/x86/kernel/process.c:153) 
> [ 36.905318][ T60] ? kthread_is_per_cpu (kbuild/obj/consumer/x86_64-randconfig-161-20250614/kernel/kthread.c:413) 
> [ 36.905321][ T60] ret_from_fork_asm (kbuild/obj/consumer/x86_64-randconfig-161-20250614/arch/x86/entry/entry_64.S:255) 
> [   36.905330][   T60]  </TASK>
> [   36.905332][   T60] Mem-Info:
> [   36.919941][   T60] active_anon:0 inactive_anon:0 isolated_anon:0
> [   36.919941][   T60]  active_file:0 inactive_file:0 isolated_file:0
> [   36.919941][   T60]  unevictable:41612 dirty:0 writeback:0
> [   36.919941][   T60]  slab_reclaimable:7429 slab_unreclaimable:145259
> [   36.919941][   T60]  mapped:0 shmem:0 pagetables:145
> [   36.919941][   T60]  sec_pagetables:0 bounce:0
> [   36.919941][   T60]  kernel_misc_reclaimable:0
> [   36.919941][   T60]  free:3233392 free_pcp:1185 free_cma:0
> [   36.923830][   T60] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:166448kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:1952kB pagetables:580kB sec_pagetables:0kB all_unreclaimable? no Balloon:0kB
> [   36.926265][   T60] DMA free:15360kB boost:0kB min:16kB low:28kB high:40kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [   36.928855][   T60] lowmem_reserve[]: 0 2991 13741 13741
> [   36.929411][   T60] DMA32 free:3060560kB boost:0kB min:3224kB low:6244kB high:9264kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:3063680kB mlocked:0kB bounce:0kB free_pcp:3120kB local_pcp:3120kB free_cma:0kB
> [   36.932080][   T60] lowmem_reserve[]: 0 0 10749 10749
> [   36.932604][   T60] Normal free:9857648kB boost:0kB min:11744kB low:22748kB high:33752kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:166448kB writepending:0kB present:13631488kB managed:11007884kB mlocked:0kB bounce:0kB free_pcp:1620kB local_pcp:740kB free_cma:0kB
> [   36.935336][   T60] lowmem_reserve[]: 0 0 0 0
> [   36.935802][   T60] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (M) = 15360kB
> [   36.936931][   T60] DMA32: 0*4kB 0*8kB 1*16kB (M) 2*32kB (M) 2*64kB (M) 1*128kB (M) 2*256kB (M) 2*512kB (M) 1*1024kB (M) 1*2048kB (M) 746*4096kB (M) = 3060560kB
> [   36.938318][   T60] Normal: 6*4kB (ME) 2*8kB (ME) 7*16kB (UME) 5*32kB (M) 3*64kB (ME) 4*128kB (M) 6*256kB (UME) 2*512kB (M) 1*1024kB (M) 3*2048kB (UME) 2404*4096kB (M) = 9857528kB
> [   36.939849][   T60] 41618 total pagecache pages
> [   36.940324][   T60] 4194174 pages RAM
> [   36.940721][   T60] 0 pages HighMem/MovableOnly
> [   36.941188][   T60] 672443 pages reserved
> [   36.941626][   T60] Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#1] SMP KASAN
> [   36.942185][   T60] KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
> [   36.942185][   T60] CPU: 1 UID: 0 PID: 60 Comm: vmalloc_test/0 Not tainted 6.15.0-rc6-00142-g2d76e79315e4 #1 VOLUNTARY 
> [   36.942185][   T60] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [   36.942185][   T60] RIP: 0010:down_read_trylock+0xa7/0x2b9
> [   36.942185][   T60] Code: b0 ef 25 91 e8 57 16 40 00 83 3d 9c e6 a7 09 00 0f 85 2c 01 00 00 48 8d 6b 68 b8 ff ff 37 00 48 89 ea 48 c1 e0 2a 48 c1 ea 03 <80> 3c 02 00 74 08 48 89 ef e8 3c 16 40 00 48 3b 5b 68 0f 84 00 01
> [   36.942185][   T60] RSP: 0000:ffff88814657f848 EFLAGS: 00010206
> [   36.942185][   T60] RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 1ffffffff224bdf6
> [   36.942185][   T60] RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
> [   36.942185][   T60] RBP: 00000000000000d8 R08: 0000000000000000 R09: 0000000000000000
> [   36.942185][   T60] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11028caff0a
> [   36.942185][   T60] R13: ffff88814657fa30 R14: dffffc0000000000 R15: 0000000000000000
> [   36.942185][   T60] FS:  0000000000000000(0000) GS:ffff88841c1f0000(0000) knlGS:0000000000000000
> [   36.942185][   T60] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   36.942185][   T60] CR2: 0000000000000000 CR3: 00000001636e0000 CR4: 00000000000406b0
> [   36.942185][   T60] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   36.942185][   T60] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   36.942185][   T60] Call Trace:
> [   36.942185][   T60]  <TASK>
> [   36.942185][   T60]  ? clear_nonspinnable+0x32/0x32
> [   36.942185][   T60]  ? vprintk_emit+0x165/0x194
> [   36.942185][   T60]  codetag_trylock_module_list+0xd/0x19
> [   36.942185][   T60]  alloc_tag_top_users+0x95/0x216
> [   36.942185][   T60]  ? _printk+0xad/0xdf
> [   36.942185][   T60]  ? reserve_module_tags+0x308/0x308
> [   36.942185][   T60]  __show_mem+0x167/0x54b
> [   36.942185][   T60]  ? _printk+0xad/0xdf
> [   36.942185][   T60]  ? printk_get_console_flush_type+0x272/0x272
> [   36.942185][   T60]  ? show_free_areas+0x115d/0x115d
> [   36.942185][   T60]  ? tracer_hardirqs_on+0x1b/0x28d
> [   36.942185][   T60]  ? dump_stack_lvl+0x91/0xd6
> [   36.942185][   T60]  ? warn_alloc+0x251/0x291
> [   36.942185][   T60]  warn_alloc+0x251/0x291
> [   36.942185][   T60]  ? has_managed_dma+0x37/0x37
> [   36.942185][   T60]  ? __get_vm_area_node+0x33a/0x3c0
> [   36.942185][   T60]  __vmalloc_node_range_noprof+0x170/0x306
> [   36.942185][   T60]  ? __vmalloc_area_node+0x460/0x460
> [   36.942185][   T60]  ? test_func+0x2ae/0x469
> [   36.942185][   T60]  __vmalloc_node_noprof+0xb8/0xd9
> [   36.942185][   T60]  ? test_func+0x2ae/0x469
> [   36.942185][   T60]  align_shift_alloc_test+0xa8/0x165
> [   36.942185][   T60]  test_func+0x2ae/0x469
> [   36.942185][   T60]  ? pcpu_alloc_test+0x31b/0x31b
> [   36.942185][   T60]  ? __kthread_parkme+0xcb/0x1a3
> [   36.942185][   T60]  ? pcpu_alloc_test+0x31b/0x31b
> [   36.942185][   T60]  kthread+0x452/0x464
> [   36.942185][   T60]  ? kthread_is_per_cpu+0x51/0x51
> [   36.942185][   T60]  ? _raw_spin_unlock_irq+0x23/0x35
> [   36.942185][   T60]  ? kthread_is_per_cpu+0x51/0x51
> [   36.942185][   T60]  ret_from_fork+0x20/0x54
> [   36.942185][   T60]  ? kthread_is_per_cpu+0x51/0x51
> [   36.942185][   T60]  ret_from_fork_asm+0x11/0x20
> [   36.942185][   T60]  </TASK>
> [   36.942185][   T60] Modules linked in:
> [   37.000652][   T60] ---[ end trace 0000000000000000 ]---
> [   37.001188][   T60] RIP: 0010:down_read_trylock+0xa7/0x2b9
> [   37.001731][   T60] Code: b0 ef 25 91 e8 57 16 40 00 83 3d 9c e6 a7 09 00 0f 85 2c 01 00 00 48 8d 6b 68 b8 ff ff 37 00 48 89 ea 48 c1 e0 2a 48 c1 ea 03 <80> 3c 02 00 74 08 48 89 ef e8 3c 16 40 00 48 3b 5b 68 0f 84 00 01
> [   37.003488][   T60] RSP: 0000:ffff88814657f848 EFLAGS: 00010206
> [   37.004072][   T60] RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 1ffffffff224bdf6
> [   37.004848][   T60] RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
> [   37.005610][   T60] RBP: 00000000000000d8 R08: 0000000000000000 R09: 0000000000000000
> [   37.006381][   T60] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11028caff0a
> [   37.007178][   T60] R13: ffff88814657fa30 R14: dffffc0000000000 R15: 0000000000000000
> [   37.007940][   T60] FS:  0000000000000000(0000) GS:ffff88841c1f0000(0000) knlGS:0000000000000000
> [   37.008792][   T60] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   37.009411][   T60] CR2: 0000000000000000 CR3: 00000001636e0000 CR4: 00000000000406b0
> [   37.010175][   T60] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   37.010950][   T60] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   37.011716][   T60] Kernel panic - not syncing: Fatal exception
> [   37.012397][   T60] Kernel Offset: 0x6200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250618/202506181351.bba867dd-lkp@intel.com
> 
> 
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
> 
> 

-- 
Cheers,
Harry / Hyeonggon
