Message-ID: <202509192101.5d4e0282-lkp@intel.com>
Date: Fri, 19 Sep 2025 21:56:38 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Yury Norov <yury.norov@...il.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>, Yury Norov <yury.norov@...il.com>,
	Rasmus Villemoes <linux@...musvillemoes.dk>, <oliver.sang@...el.com>
Subject: Re: [PATCH 3/3] group_cpus: optimize grp_spread_init_one()



Hello,

kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: 5df2998b7baa3fd4cb66272d1ea5625573b4a63f ("[PATCH 3/3] group_cpus: optimize grp_spread_init_one()")
url: https://github.com/intel-lab-lkp/linux/commits/Yury-Norov/bitmap-cpumask-introduce-and_andnot-search-helper-and-iterator/20250911-051023
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 8ad25ebfa70e86860559b306bbc923c7db4fcac6
patch link: https://lore.kernel.org/all/20250910210850.404834-4-yury.norov@gmail.com/
patch subject: [PATCH 3/3] group_cpus: optimize grp_spread_init_one()

in testcase: stress-ng
version: stress-ng-x86_64-665b4465f-1_20250912
with the following parameters:

	nr_threads: 100%
	testtime: 60s
	test: cpu-online
	cpufreq_governor: performance



config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 104 threads 2 sockets (Skylake) with 192G memory
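
For readers who want to approximate the workload by hand, the parameters above can be turned into a stress-ng command line. This is only a sketch: the helper name stress_ng_argv is hypothetical, the mapping of "nr_threads: 100%" to one worker per hardware thread (104 on this machine) is an assumption, and the real lkp-tests job may pass additional options. The cpufreq governor setting is not covered here.

```python
import shlex


def stress_ng_argv(stressor: str, workers: int, testtime: str) -> list[str]:
    """Build a stress-ng argv list (hypothetical helper, nothing is executed)."""
    # "--<stressor> N" starts N workers of that stressor;
    # "--timeout T" stops the run after T.
    return ["stress-ng", f"--{stressor}", str(workers), "--timeout", testtime]


# Assumption: nr_threads=100% means one worker per hardware thread (104 here).
argv = stress_ng_argv("cpu-online", 104, "60s")
command = shlex.join(argv)
print(command)
```

The cpu-online stressor offlines and onlines CPUs, so it needs root and should only be run on a disposable test machine.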

(please refer to attached dmesg/kmsg for entire log/backtrace)


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202509192101.5d4e0282-lkp@intel.com



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250919/202509192101.5d4e0282-lkp@intel.com [1]


We observed that the issue happens randomly in our tests.

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/testtime/test/cpufreq_governor:
  lkp-skl-fpga01/stress-ng/debian-13-x86_64-20250902.cgz/x86_64-rhel-9.4/gcc-14/100%/60s/cpu-online/performance

commit:
  6ade57f62d272 ("group_cpus: don't call cpumask_weight() prematurely")
  5df2998b7baa3 ("group_cpus: optimize grp_spread_init_one()")

6ade57f62d272d3e 5df2998b7baa3fd4cb66272d1ea
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :30          43%          13:30    dmesg.BUG:kernel_NULL_pointer_dereference,address
           :30          43%          13:30    dmesg.Kernel_panic-not_syncing:Fatal_exception
           :30          43%          13:30    dmesg.Oops
           :30          43%          13:30    dmesg.RIP:blk_mq_all_tag_iter
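
For reference, the %reproduction column follows from the two fail:runs columns. A minimal sketch of that arithmetic, assuming an empty fail count (":30") means zero failures in 30 runs; the helper name reproduction_rate is hypothetical:

```python
def reproduction_rate(cell: str) -> float:
    """Parse a 'fail:runs' cell such as '13:30' or ':30' into a failure ratio."""
    fails, runs = cell.strip().split(":")
    # Assumption: a blank fail count means no failures were seen.
    return (int(fails) if fails else 0) / int(runs)


parent = reproduction_rate(":30")     # 6ade57f62d272 (parent commit)
patched = reproduction_rate("13:30")  # 5df2998b7baa3 (patched commit)
delta = patched - parent
print(f"parent {parent:.0%}, patched {patched:.0%}, delta {delta:.0%}")
```

With 13 failures in 30 runs against none on the parent, the difference rounds to the 43% shown in the table.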

Sorry, our bot failed to upload the correct dmesg to the above link [1];
one is attached FYI.


[   43.544050][  T130] BUG: kernel NULL pointer dereference, address: 0000000000000004
[   43.552056][  T130] #PF: supervisor read access in kernel mode
[   43.558218][  T130] #PF: error_code(0x0000) - not-present page
[   43.564340][  T130] PGD 0 P4D 0 
[   43.567866][  T130] Oops: Oops: 0000 [#1] SMP PTI
[   43.572855][  T130] CPU: 19 UID: 0 PID: 130 Comm: cpuhp/19 Not tainted 6.17.0-rc1-00014-g5df2998b7baa #1 VOLUNTARY 
[   43.583575][  T130] Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.1D.01.0147.121320181755 12/13/2018
[   43.594817][  T130] RIP: 0010:blk_mq_all_tag_iter+0x1a/0x230
[   43.600779][  T130] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 41 57 41 56 41 55 41 54 49 89 fc 55 53 48 83 ec 50 <8b> 4f 04 48 89 34 24 48 89 54 24 08 65 48 8b 05 8a b4 48 02 48 89
[   43.620822][  T130] RSP: 0018:ffffc9000ce5fda0 EFLAGS: 00010286
[   43.627048][  T130] RAX: 0000604fe9627758 RBX: ffffc9000ce5fe28 RCX: 0000000000000068
[   43.635176][  T130] RDX: ffffc9000ce5fe28 RSI: ffffffff8199e310 RDI: 0000000000000000
[   43.643294][  T130] RBP: ffff88810dd98600 R08: ffffffff833d72a0 R09: ffff8881003c8710
[   43.651410][  T130] R10: ffffc9000ce5fdb0 R11: ffffc9000ce5fdb8 R12: 0000000000000000
[   43.659516][  T130] R13: 0000000000000013 R14: ffff88810dd98760 R15: ffff8897e08dbde8
[   43.667621][  T130] FS:  0000000000000000(0000) GS:ffff88985caa3000(0000) knlGS:0000000000000000
[   43.676681][  T130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   43.683393][  T130] CR2: 0000000000000004 CR3: 000000303de24004 CR4: 00000000007726f0
[   43.691491][  T130] PKRU: 55555554
[   43.695175][  T130] Call Trace:
[   43.698593][  T130]  <TASK>
[   43.701657][  T130]  ? __call_rcu_common+0xb0/0x2f0
[   43.707850][  T130]  ? xas_load+0xd/0xf0
[   43.712054][  T130]  ? xa_load+0x58/0xb0
[   43.716259][  T130]  blk_mq_hctx_notify_offline+0xd1/0x170
[   43.722025][  T130]  ? __pfx_blk_mq_hctx_notify_offline+0x10/0x10
[   43.728390][  T130]  cpuhp_invoke_callback+0x1c7/0x370
[   43.733805][  T130]  ? __pfx_smpboot_thread_fn+0x10/0x10
[   43.739386][  T130]  cpuhp_thread_fun+0x98/0x170
[   43.744273][  T130]  smpboot_thread_fn+0xc8/0x1b0
[   43.749254][  T130]  kthread+0xe5/0x1f0
[   43.753363][  T130]  ? __pfx_kthread+0x10/0x10
[   43.758082][  T130]  ret_from_fork+0x132/0x170
[   43.762799][  T130]  ? __pfx_kthread+0x10/0x10
[   43.767513][  T130]  ret_from_fork_asm+0x1a/0x30
[   43.772398][  T130]  </TASK>
[   43.775532][  T130] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common btrfs blake2b_generic xor raid6_pq skx_edac skx_edac_common nfit libnvdimm sd_mod sg x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel irdma kvm irqbypass ice snd_pcm ghash_clmulni_intel ahci snd_timer rapl gnss snd ast ib_uverbs nvme intel_cstate libahci ipmi_ssif drm_client_lib soundcore binfmt_misc acpi_power_meter mei_me drm_shmem_helper i2c_i801 ib_core ioatdma pcspkr nvme_core libata ipmi_si intel_uncore acpi_ipmi drm_kms_helper mei i2c_smbus intel_pch_thermal lpc_ich dca wmi ipmi_devintf ipmi_msghandler acpi_pad joydev drm fuse nfnetlink
[   43.836729][  T130] CR2: 0000000000000004
[   43.841005][  T130] ---[ end trace 0000000000000000 ]---
[   43.855607][  T130] pstore: backend (erst) writing error (-28)
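
The register dump and the Code: line above are enough to cross-check the fault address. The sketch below decodes only the three bytes starting at the marked <8b> and compares the resulting effective address with CR2. It is not a general disassembler: it handles just the "8b /r" load form with an 8-bit displacement, and the helper names (registers, decode_mov_load) are hypothetical.

```python
import re

# Lines copied from the oops above (timestamps stripped).
OOPS = """\
RIP: 0010:blk_mq_all_tag_iter+0x1a/0x230
RDX: ffffc9000ce5fe28 RSI: ffffffff8199e310 RDI: 0000000000000000
CR2: 0000000000000004 CR3: 000000303de24004 CR4: 00000000007726f0
"""
FAULT_BYTES = bytes.fromhex("8b4f04")  # "<8b> 4f 04" from the Code: line

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def registers(text: str) -> dict[str, int]:
    """Collect 'NAME: <16 hex digits>' pairs from a register dump."""
    pairs = re.findall(r"\b([A-Z0-9]{2,3}): ([0-9a-f]{16})\b", text)
    return {name.lower(): int(value, 16) for name, value in pairs}


def decode_mov_load(code: bytes) -> tuple[str, str, int]:
    """Decode 'mov r32, [r64 + disp8]' (opcode 8b, mod=01, no SIB, no REX)."""
    opcode, modrm, disp = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    disp8 = disp - 256 if disp >= 128 else disp  # sign-extend the displacement
    return REG32[reg], REG64[rm], disp8


regs = registers(OOPS)
dst, base, disp = decode_mov_load(FAULT_BYTES)
fault_addr = regs[base] + disp
print(f"mov {disp:#x}(%{base}),%{dst} -> reads {fault_addr:#x}, "
      f"CR2 = {regs['cr2']:#x}")
```

The load is a 4-byte read at offset 4 from RDI, and RDI is 0, which matches the reported address 0000000000000004. Since RDI carries the first argument in the x86-64 calling convention and the fault is only 0x1a bytes into the function, this suggests blk_mq_all_tag_iter() was called with a NULL first argument, though the attached dmesg should be checked before drawing conclusions.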

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Download attachment "dmesg.xz" of type "application/x-xz" (30376 bytes)
