Message-ID: <202511171515.226076ea-lkp@intel.com>
Date: Mon, 17 Nov 2025 16:03:50 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>, <linux-perf-users@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [linus:master] [perf]  90942f9fac:
 BUG:kernel_NULL_pointer_dereference,address



Hello,

kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: 90942f9fac05702065ff82ed0bade0d08168d4ea ("perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on      linus/master 24172e0d79900908cf5ebf366600616d29c9b417]
[test failed on linux-next/master b179ce312bafcb8c68dc718e015aee79b7939ff0]
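
For readers unfamiliar with the commit subject, the two predicates it names can disagree for a task that has no mm but is not flagged as a kernel thread. The sketch below is a user-space model only, not kernel code: the Task class and helper names are hypothetical, the flag values are assumed to match include/linux/sched.h, and the "exiting" case is an illustrative assumption. Nothing in this report establishes that it is the cause of the failures.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed to match include/linux/sched.h; check against the tree under test.
PF_USER_WORKER = 0x00004000
PF_KTHREAD = 0x00200000


@dataclass
class Task:
    """Hypothetical stand-in for the two task_struct fields of interest."""
    flags: int
    mm: Optional[object]


def is_user_task_by_mm(task: Task) -> bool:
    # Parent-style test: a task with an mm is treated as a user task.
    return task.mm is not None


def is_user_task_by_flags(task: Task) -> bool:
    # Commit-style test: anything not flagged as a kthread or user worker.
    return not (task.flags & (PF_KTHREAD | PF_USER_WORKER))


user = Task(flags=0, mm=object())
kthread = Task(flags=PF_KTHREAD, mm=None)
# Illustrative assumption: a user task whose mm has already been dropped.
exiting = Task(flags=0, mm=None)

for name, task in (("user", user), ("kthread", kthread), ("exiting", exiting)):
    print(name, is_user_task_by_mm(task), is_user_task_by_flags(task))
```

In this model the two tests agree for the ordinary user task and the kernel thread, and differ only for the task with flags == 0 and mm == None.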

in testcase: stress-ng
version: stress-ng-x86_64-f38a0b09a-1_20251013
with the following parameters:

	nr_threads: 100%
	testtime: 60s
	test: schedmix
	cpufreq_governor: performance



config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P  CPU @ 2.4GHz (Granite Rapids) with 256G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)


We noticed various issues occurring randomly, as listed below; the parent commit stays clean.

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/testtime/test/cpufreq_governor:
  lkp-gnr-2sp3/stress-ng/debian-13-x86_64-20250902.cgz/x86_64-rhel-9.4/gcc-14/100%/60s/schedmix/performance

153f9e74dec230f2 90942f9fac05702065ff82ed0ba
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :50          28%          14:50    dmesg.BUG:kernel_NULL_pointer_dereference,address
           :50          32%          16:50    dmesg.Kernel_panic-not_syncing:Fatal_exception_in_interrupt
           :50          32%          16:50    dmesg.Oops
           :50           2%           1:50    dmesg.RIP:__slab_free
           :50           2%           1:50    dmesg.RIP:__update_load_avg_se
           :50           4%           2:50    dmesg.RIP:asm_sysvec_apic_timer_interrupt
           :50          12%           6:50    dmesg.RIP:finish_task_switch
           :50          32%          16:50    dmesg.RIP:gup_fast_pgd_range
           :50          16%           8:50    dmesg.RIP:osq_lock
           :50           2%           1:50    dmesg.RIP:rwsem_spin_on_owner
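
The %reproduction column above is consistent with the difference between the two fail:runs ratios. A minimal sketch of that arithmetic follows, assuming an empty failure count (as in ":50") means zero failures; the function name and the exact formula are assumptions inferred from the rows shown, not taken from the lkp tooling.

```python
def reproduction_pct(parent_cell: str, commit_cell: str) -> int:
    """Percentage-point difference in failure rate between two fail:runs cells."""
    def parse(cell: str) -> tuple[int, int]:
        fails, runs = cell.strip().split(":")
        # Assumption: an empty count before the colon means no failures.
        return (int(fails) if fails else 0), int(runs)

    parent_fails, parent_runs = parse(parent_cell)
    commit_fails, commit_runs = parse(commit_cell)
    return round(100 * (commit_fails / commit_runs - parent_fails / parent_runs))


# Rows taken from the table above.
print(reproduction_pct(":50", "14:50"))  # NULL pointer dereference row
print(reproduction_pct(":50", "16:50"))  # gup_fast_pgd_range row
```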



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202511171515.226076ea-lkp@intel.com


[  296.249434][  C134] BUG: kernel NULL pointer dereference, address: 0000000000000068
[  296.249437][  C134] #PF: supervisor read access in kernel mode
[  296.249439][  C134] #PF: error_code(0x0000) - not-present page
[  296.249447][  C134] PGD 0
[  296.249449][  C134] Oops: Oops: 0000 [#1] SMP NOPTI
[  296.249452][  C134] CPU: 134 UID: 0 PID: 79798 Comm: stress-ng-sched Not tainted 6.17.0-rc1-00053-g90942f9fac05 #1 VOLUNTARY
[  296.249457][  C134] Hardware name: IEIT SYSTEMS NF5180-M8-A0-R0-00/NF5180-M8-A0-R0-00, BIOS 02.03.00 03/03/2025
[  296.249458][  C134] RIP: 0010:gup_fast_pgd_range (include/linux/pgtable.h:161 mm/gup.c:3121)
[  296.249471][  C134] Code: 22 7e 02 89 54 24 04 48 8b 90 f0 08 00 00 48 89 f8 65 4c 8b 2d b5 22 7e 02 4c 89 ac 24 80 00 00 00 49 89 cd 8b 0d 00 67 59 01 <48> 8b 52 68 48 d3 e8 25 ff 01 00 00 4c 8d 24 c2 89 f2 89 f0 83 e2
All code
========
   0:	22 7e 02             	and    0x2(%rsi),%bh
   3:	89 54 24 04          	mov    %edx,0x4(%rsp)
   7:	48 8b 90 f0 08 00 00 	mov    0x8f0(%rax),%rdx
   e:	48 89 f8             	mov    %rdi,%rax
  11:	65 4c 8b 2d b5 22 7e 	mov    %gs:0x27e22b5(%rip),%r13        # 0x27e22ce
  18:	02 
  19:	4c 89 ac 24 80 00 00 	mov    %r13,0x80(%rsp)
  20:	00 
  21:	49 89 cd             	mov    %rcx,%r13
  24:	8b 0d 00 67 59 01    	mov    0x1596700(%rip),%ecx        # 0x159672a
  2a:*	48 8b 52 68          	mov    0x68(%rdx),%rdx		<-- trapping instruction
  2e:	48 d3 e8             	shr    %cl,%rax
  31:	25 ff 01 00 00       	and    $0x1ff,%eax
  36:	4c 8d 24 c2          	lea    (%rdx,%rax,8),%r12
  3a:	89 f2                	mov    %esi,%edx
  3c:	89 f0                	mov    %esi,%eax
  3e:	83                   	.byte 0x83
  3f:	e2                   	.byte 0xe2

Code starting with the faulting instruction
===========================================
   0:	48 8b 52 68          	mov    0x68(%rdx),%rdx
   4:	48 d3 e8             	shr    %cl,%rax
   7:	25 ff 01 00 00       	and    $0x1ff,%eax
   c:	4c 8d 24 c2          	lea    (%rdx,%rax,8),%r12
  10:	89 f2                	mov    %esi,%edx
  12:	89 f0                	mov    %esi,%eax
  14:	83                   	.byte 0x83
  15:	e2                   	.byte 0xe2
[  296.249473][  C134] RSP: 0018:fffffe0001eef6d8 EFLAGS: 00010096
[  296.249498][  C134] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000030
[  296.249499][  C134] RDX: 0000000000000000 RSI: 0000000000100002 RDI: 0000000000000000
[  296.249499][  C134] RBP: 0000000000000000 R08: fffffe0001eef79c R09: 0000000000000012
[  296.249500][  C134] R10: 0000000000000005 R11: 0000000000000000 R12: fffffe0001eef820
[  296.249501][  C134] R13: fffffe0001eef820 R14: 0000000000000fff R15: 0000000000000fff
[  296.249506][  C134] FS:  0000000000000000(0000) GS:ff1100207a57e000(0000) knlGS:0000000000000000
[  296.249507][  C134] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  296.249507][  C134] CR2: 0000000000000068 CR3: 0000062d81424001 CR4: 0000000000f73ef0
[  296.249508][  C134] PKRU: 55555554
[  296.249509][  C134] Call Trace:
[  296.249510][  C134]  <NMI>
[  296.249521][  C134]  ? search_extable (lib/extable.c:118)
[  296.249526][  C134]  ? __get_user_nocheck_8 (arch/x86/lib/getuser.S:153)
[  296.249530][  C134]  ? search_exception_tables (kernel/extable.c:59)
[  296.249533][  C134]  ? fixup_exception (arch/x86/mm/extable.c:216 arch/x86/mm/extable.c:364)
[  296.249537][  C134]  gup_fast (arch/x86/include/asm/irqflags.h:158 (discriminator 1) mm/gup.c:3181 (discriminator 1))
[  296.249542][  C134]  gup_fast_fallback (mm/gup.c:3225)
[  296.249544][  C134]  perf_virt_to_phys (include/linux/mm.h:2582 kernel/events/core.c:8093)
[  296.249552][  C134]  ? __perf_event_header__init_id (kernel/events/core.c:1433 kernel/events/core.c:1445 kernel/events/core.c:7698)
[  296.249553][  C134]  perf_prepare_sample (kernel/events/core.c:8357)
[  296.249558][  C134]  perf_event_output_forward (kernel/events/core.c:8489 kernel/events/core.c:8509)
[  296.249559][  C134]  ? get_perf_callchain (kernel/events/callchain.c:247)
[  296.249561][  C134]  ? local_clock_noinstr (kernel/sched/clock.c:304 (discriminator 1))
[  296.249564][  C134]  __perf_event_overflow (kernel/events/core.c:10384 (discriminator 2))
[  296.249565][  C134]  ? setup_pebs_adaptive_sample_data (arch/x86/events/intel/ds.c:2159)
[  296.249568][  C134]  intel_pmu_drain_pebs_icl (arch/x86/events/intel/../perf_event.h:128 arch/x86/events/intel/ds.c:2367 arch/x86/events/intel/ds.c:2667)
[  296.249570][  C134]  ? exit_mmap (include/linux/sched.h:2086 mm/mmap.c:1307)
[  296.249572][  C134]  ? exit_mmap (include/linux/sched.h:2086 mm/mmap.c:1307)
[  296.249573][  C134]  ? nmi_restore (arch/x86/entry/entry_64.S:1467)
[  296.249583][  C134]  handle_pmi_common (arch/x86/events/intel/core.c:3205)
[  296.249598][  C134]  ? nmi_restore (arch/x86/entry/entry_64.S:1467)
[  296.249600][  C134]  ? flush_tlb_one_kernel (arch/x86/include/asm/paravirt.h:85 arch/x86/mm/tlb.c:1630 arch/x86/mm/tlb.c:1585)
[  296.249602][  C134]  ? native_set_fixmap (arch/x86/mm/pgtable.c:582 arch/x86/mm/pgtable.c:591)
[  296.249604][  C134]  ? ghes_copy_tofrom_phys (drivers/acpi/apei/ghes.c:345)
[  296.249607][  C134]  intel_pmu_handle_irq (arch/x86/include/asm/msr.h:70 arch/x86/include/asm/msr.h:108 arch/x86/events/intel/core.c:2505 arch/x86/events/intel/core.c:3349)
[  296.249609][  C134]  perf_event_nmi_handler (arch/x86/events/core.c:1767 arch/x86/events/core.c:1753)
[  296.249614][  C134]  nmi_handle (arch/x86/kernel/nmi.c:162 arch/x86/kernel/nmi.c:130)
[  296.249618][  C134]  default_do_nmi (arch/x86/kernel/nmi.c:393 (discriminator 61))
[  296.249627][  C134]  exc_nmi (arch/x86/kernel/nmi.c:588)
[  296.249629][  C134]  end_repeat_nmi (arch/x86/entry/entry_64.S:1409)
[  296.249634][  C134] RIP: 0010:__slab_free (mm/slub.c:495 (discriminator 1) mm/slub.c:567 (discriminator 1) mm/slub.c:4509 (discriminator 1))
[  296.249637][  C134] Code: 0f 84 62 01 00 00 49 8b 95 b8 00 00 00 48 89 c1 41 89 da 48 89 5c 24 58 48 0f c9 66 44 2b 54 24 2c 4c 31 f2 66 44 89 54 24 58 <48> 31 ca 48 89 10 89 d8 c1 e8 1f 4d 85 f6 41 89 c1 41 0f 94 c0 66
All code
========
   0:	0f 84 62 01 00 00    	je     0x168
   6:	49 8b 95 b8 00 00 00 	mov    0xb8(%r13),%rdx
   d:	48 89 c1             	mov    %rax,%rcx
  10:	41 89 da             	mov    %ebx,%r10d
  13:	48 89 5c 24 58       	mov    %rbx,0x58(%rsp)
  18:	48 0f c9             	bswap  %rcx
  1b:	66 44 2b 54 24 2c    	sub    0x2c(%rsp),%r10w
  21:	4c 31 f2             	xor    %r14,%rdx
  24:	66 44 89 54 24 58    	mov    %r10w,0x58(%rsp)
  2a:*	48 31 ca             	xor    %rcx,%rdx		<-- trapping instruction
  2d:	48 89 10             	mov    %rdx,(%rax)
  30:	89 d8                	mov    %ebx,%eax
  32:	c1 e8 1f             	shr    $0x1f,%eax
  35:	4d 85 f6             	test   %r14,%r14
  38:	41 89 c1             	mov    %eax,%r9d
  3b:	41 0f 94 c0          	sete   %r8b
  3f:	66                   	data16

Code starting with the faulting instruction
===========================================
   0:	48 31 ca             	xor    %rcx,%rdx
   3:	48 89 10             	mov    %rdx,(%rax)
   6:	89 d8                	mov    %ebx,%eax
   8:	c1 e8 1f             	shr    $0x1f,%eax
   b:	4d 85 f6             	test   %r14,%r14
   e:	41 89 c1             	mov    %eax,%r9d
  11:	41 0f 94 c0          	sete   %r8b
  15:	66                   	data16
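
As a reading aid for the first oops above, the sketch below decodes the #PF error code using the architectural x86 bit layout and recomputes the address touched by the trapping instruction (mov 0x68(%rdx),%rdx with RDX = 0). The helper names are hypothetical; the register and error-code values are copied from the log.

```python
# x86 page-fault error code bits: (bit, meaning if set, meaning if clear).
PF_ERROR_BITS = [
    (0, "protection violation", "not-present page"),
    (1, "write access", "read access"),
    (2, "user access", "supervisor access"),
    (3, "reserved bit set in page table", None),
    (4, "instruction fetch", None),
]


def decode_pf_error(code: int) -> list[str]:
    """Return a human-readable description for each relevant error-code bit."""
    described = []
    for bit, if_set, if_clear in PF_ERROR_BITS:
        text = if_set if code & (1 << bit) else if_clear
        if text is not None:
            described.append(text)
    return described


def effective_address(base: int, displacement: int) -> int:
    """Address of a base+displacement memory operand, wrapped to 64 bits."""
    return (base + displacement) & 0xFFFFFFFFFFFFFFFF


print(decode_pf_error(0x0000))
print(hex(effective_address(0x0, 0x68)))
```

With error code 0x0000 every bit is clear, which corresponds to the "supervisor read access" and "not-present page" lines in the log, and the computed address 0x68 equals the reported CR2.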


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251117/202511171515.226076ea-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

