Date: Fri, 19 Jan 2024 10:41:16 +0800
From: kernel test robot <oliver.sang@...el.com>
To: <namhyung@...nel.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Ian Rogers
	<irogers@...gle.com>, Kan Liang <kan.liang@...ux.intel.com>,
	<linux-perf-users@...r.kernel.org>, <linux-kernel@...r.kernel.org>, "Peter
 Zijlstra" <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, Mark
 Rutland <mark.rutland@....com>, Alexander Shishkin
	<alexander.shishkin@...ux.intel.com>, Arnaldo Carvalho de Melo
	<acme@...nel.org>, Mingwei Zhang <mizhang@...gle.com>, Namhyung Kim
	<namhyung@...nel.org>, <oliver.sang@...el.com>
Subject: Re: [PATCH v2 2/2] perf/core: Reduce PMU access to adjust sample freq



Hello,

kernel test robot noticed "WARNING:at_arch/x86/events/core.c:#x86_pmu_start" on:

commit: d6da92786f901cc4ce3588f101182758da295dbb ("[PATCH v2 2/2] perf/core: Reduce PMU access to adjust sample freq")
url: https://github.com/intel-lab-lkp/linux/commits/namhyung-kernel-org/perf-core-Reduce-PMU-access-to-adjust-sample-freq/20240112-044505
base: https://git.kernel.org/cgit/linux/kernel/git/perf/perf-tools-next.git perf-tools-next
patch link: https://lore.kernel.org/all/20240111204348.669673-2-namhyung@kernel.org/
patch subject: [PATCH v2 2/2] perf/core: Reduce PMU access to adjust sample freq

in testcase: will-it-scale
version: will-it-scale-x86_64-75f66e4-1_20240111
with the following parameters:

	nr_task: 16
	mode: thread
	test: pipe1
	cpufreq_governor: performance



compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory

(please refer to the attached dmesg/kmsg for the entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202401191023.d52a4ad4-oliver.sang@intel.com
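
For illustration only, here is a minimal sketch of how the two tags above could be appended to a commit message. The helper name `add_trailers` and the sample commit message are hypothetical and not part of any kernel or git tooling. The Reported-by address is copied as it appears in this archive, where it is masked, so the real address would have to be substituted.

```python
import re

# Tags quoted from the report. The e-mail address is masked by the list
# archive; replace it with the real address before using the trailer.
TAGS = [
    "Reported-by: kernel test robot <oliver.sang@...el.com>",
    "Closes: https://lore.kernel.org/oe-lkp/202401191023.d52a4ad4-oliver.sang@intel.com",
]

# Rough heuristic (an assumption): a trailer line looks like "Key: value".
TRAILER_RE = re.compile(r"^[A-Za-z][A-Za-z-]*: \S")


def add_trailers(message, trailers):
    """Append trailer lines to a commit message, skipping ones already present."""
    lines = message.rstrip("\n").split("\n")
    missing = [t for t in trailers if t not in lines]
    if missing:
        # Join an existing trailer block if the message ends in one,
        # otherwise start a new paragraph for the trailers.
        ends_in_trailer = len(lines) > 1 and TRAILER_RE.match(lines[-1])
        if not ends_in_trailer:
            lines.append("")
        lines.extend(missing)
    return "\n".join(lines) + "\n"


# Hypothetical commit message used only to exercise the helper.
example = (
    "perf/core: Example fix subject\n"
    "\n"
    "Body text describing the fix.\n"
    "\n"
    "Signed-off-by: A Developer <dev@example.org>\n"
)
updated = add_trailers(example, TAGS)
```

Running the helper a second time on `updated` leaves it unchanged, because tags that are already present are skipped.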


[  102.087071][   C94] ------------[ cut here ]------------
[ 102.092623][ C94] WARNING: CPU: 94 PID: 0 at arch/x86/events/core.c:1507 x86_pmu_start (arch/x86/events/core.c:1507 (discriminator 1)) 
[  102.101826][   C94] Modules linked in: intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp btrfs coretemp blake2b_generic xor kvm_intel kvm raid6_pq libcrc32c irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sd_mod sg rapl ipmi_ssif nvme nvme_core ahci intel_cstate acpi_ipmi t10_pi libahci ast crc64_rocksoft_generic drm_shmem_helper mei_me ipmi_si crc64_rocksoft i2c_i801 ioatdma libata intel_uncore drm_kms_helper joydev crc64 mei ipmi_devintf lpc_ich i2c_smbus intel_pch_thermal dca wmi ipmi_msghandler acpi_pad acpi_power_meter drm fuse ip_tables
[  102.158393][   C94] CPU: 94 PID: 0 Comm: swapper/94 Not tainted 6.7.0-rc6-00192-gd6da92786f90 #1
[ 102.167472][ C94] RIP: 0010:x86_pmu_start (arch/x86/events/core.c:1507 (discriminator 1)) 
[ 102.172832][ C94] Code: 00 00 4c 0f ab 65 00 48 89 df e8 16 08 01 00 48 89 df 5b 5d 41 5c e9 4a c6 33 00 0f 0b 5b 5d 41 5c c3 cc cc cc cc 0f 0b eb f3 <0f> 0b eb b6 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	4c 0f ab 65 00       	bts    %r12,0x0(%rbp)
   7:	48 89 df             	mov    %rbx,%rdi
   a:	e8 16 08 01 00       	callq  0x10825
   f:	48 89 df             	mov    %rbx,%rdi
  12:	5b                   	pop    %rbx
  13:	5d                   	pop    %rbp
  14:	41 5c                	pop    %r12
  16:	e9 4a c6 33 00       	jmpq   0x33c665
  1b:	0f 0b                	ud2    
  1d:	5b                   	pop    %rbx
  1e:	5d                   	pop    %rbp
  1f:	41 5c                	pop    %r12
  21:	c3                   	retq   
  22:	cc                   	int3   
  23:	cc                   	int3   
  24:	cc                   	int3   
  25:	cc                   	int3   
  26:	0f 0b                	ud2    
  28:	eb f3                	jmp    0x1d
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	eb b6                	jmp    0xffffffffffffffe4
  2e:	66 66 2e 0f 1f 84 00 	data16 nopw %cs:0x0(%rax,%rax,1)
  35:	00 00 00 00 
  39:	66                   	data16
  3a:	66                   	data16
  3b:	2e                   	cs
  3c:	0f                   	.byte 0xf
  3d:	1f                   	(bad)  
  3e:	84 00                	test   %al,(%rax)

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	eb b6                	jmp    0xffffffffffffffba
   4:	66 66 2e 0f 1f 84 00 	data16 nopw %cs:0x0(%rax,%rax,1)
   b:	00 00 00 00 
   f:	66                   	data16
  10:	66                   	data16
  11:	2e                   	cs
  12:	0f                   	.byte 0xf
  13:	1f                   	(bad)  
  14:	84 00                	test   %al,(%rax)
[  102.192917][   C94] RSP: 0018:ffffc9000ddb0e00 EFLAGS: 00010046
[  102.199175][   C94] RAX: 0000000000000001 RBX: ffff88b01d17a290 RCX: 0000000000000349
[  102.207339][   C94] RDX: 0000000000002ff0 RSI: 0000000000000002 RDI: ffff88b01d17a290
[  102.215509][   C94] RBP: ffff88afa149a220 R08: 0000000000000000 R09: 0000000000000014
[  102.223684][   C94] R10: 000000000000000f R11: 00000000000f4240 R12: 0000000000000003
[  102.231855][   C94] R13: 0000000000000001 R14: ffff88afa14b9680 R15: 000000000000005e
[  102.240038][   C94] FS:  0000000000000000(0000) GS:ffff88afa1480000(0000) knlGS:0000000000000000
[  102.249178][   C94] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  102.255986][   C94] CR2: 00007f9cdf69ec98 CR3: 000000303e01c002 CR4: 00000000007706f0
[  102.264179][   C94] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  102.272365][   C94] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  102.280552][   C94] PKRU: 55555554
[  102.284322][   C94] Call Trace:
[  102.287830][   C94]  <IRQ>
[ 102.290895][ C94] ? x86_pmu_start (arch/x86/events/core.c:1507 (discriminator 1)) 
[ 102.295695][ C94] ? __warn (kernel/panic.c:677) 
[ 102.299980][ C94] ? x86_pmu_start (arch/x86/events/core.c:1507 (discriminator 1)) 
[ 102.304768][ C94] ? report_bug (lib/bug.c:180 lib/bug.c:219) 
[ 102.309473][ C94] ? handle_bug (arch/x86/kernel/traps.c:237) 
[ 102.314006][ C94] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) 
[ 102.318879][ C94] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:568) 
[ 102.324101][ C94] ? x86_pmu_start (arch/x86/events/core.c:1507 (discriminator 1)) 
[ 102.328888][ C94] perf_adjust_freq_unthr_events (kernel/events/core.c:4181 (discriminator 4)) 
[ 102.335069][ C94] perf_adjust_freq_unthr_context (kernel/events/core.c:4216) 
[ 102.341244][ C94] perf_event_task_tick (arch/x86/include/asm/current.h:41 kernel/events/core.c:4363) 
[ 102.346458][ C94] scheduler_tick (kernel/sched/core.c:5665) 
[ 102.351240][ C94] update_process_times (kernel/time/timer.c:2079) 
[ 102.356442][ C94] tick_sched_handle (kernel/time/tick-sched.c:256) 
[ 102.361381][ C94] tick_nohz_highres_handler (kernel/time/tick-sched.c:1525) 
[ 102.367021][ C94] ? __pfx_tick_nohz_highres_handler (kernel/time/tick-sched.c:1503) 
[ 102.373345][ C94] __hrtimer_run_queues (kernel/time/hrtimer.c:1688 kernel/time/hrtimer.c:1752) 
[ 102.378720][ C94] hrtimer_interrupt (kernel/time/hrtimer.c:1817) 
[ 102.383748][ C94] __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1065 arch/x86/kernel/apic/apic.c:1082) 
[ 102.389818][ C94] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14)) 
[  102.395636][   C94]  </IRQ>
[  102.398759][   C94]  <TASK>
[ 102.401872][ C94] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:649) 
[ 102.408032][ C94] RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291) 
[ 102.414008][ C94] Code: 00 e8 9e 46 19 ff e8 d9 f1 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 07 2e 18 ff 45 84 ff 0f 85 d2 00 00 00 fb 45 85 f6 <0f> 88 83 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d 0c c4 48
All code
========
   0:	00 e8                	add    %ch,%al
   2:	9e                   	sahf   
   3:	46 19 ff             	rex.RX sbb %r15d,%edi
   6:	e8 d9 f1 ff ff       	callq  0xfffffffffffff1e4
   b:	8b 53 04             	mov    0x4(%rbx),%edx
   e:	49 89 c5             	mov    %rax,%r13
  11:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  16:	31 ff                	xor    %edi,%edi
  18:	e8 07 2e 18 ff       	callq  0xffffffffff182e24
  1d:	45 84 ff             	test   %r15b,%r15b
  20:	0f 85 d2 00 00 00    	jne    0xf8
  26:	fb                   	sti    
  27:	45 85 f6             	test   %r14d,%r14d
  2a:*	0f 88 83 01 00 00    	js     0x1b3		<-- trapping instruction
  30:	49 63 d6             	movslq %r14d,%rdx
  33:	48 8d 04 52          	lea    (%rdx,%rdx,2),%rax
  37:	48 8d 04 82          	lea    (%rdx,%rax,4),%rax
  3b:	49 8d 0c c4          	lea    (%r12,%rax,8),%rcx
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	0f 88 83 01 00 00    	js     0x189
   6:	49 63 d6             	movslq %r14d,%rdx
   9:	48 8d 04 52          	lea    (%rdx,%rdx,2),%rax
   d:	48 8d 04 82          	lea    (%rdx,%rax,4),%rax
  11:	49 8d 0c c4          	lea    (%r12,%rax,8),%rcx
  15:	48                   	rex.W
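
When scanning the full dmesg for reports like the one above, the WARNING header line can be picked apart mechanically. The following is a minimal sketch, assuming the header keeps the "WARNING: CPU: N PID: N at file:line function" layout seen in this log. The names `WARN_RE` and `parse_warning` are made up for this example.

```python
import re

# Matches the header of a kernel WARNING as it appears in this report.
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>\S+)"
)


def parse_warning(line):
    """Return CPU, PID, source location and function of a WARNING line, or None."""
    m = WARN_RE.search(line)
    if m is None:
        return None
    return {
        "cpu": int(m.group("cpu")),
        "pid": int(m.group("pid")),
        "file": m.group("file"),
        "line": int(m.group("line")),
        "func": m.group("func"),
    }


# Header line copied from the log in this report.
sample = (
    "[ 102.092623][ C94] WARNING: CPU: 94 PID: 0 at "
    "arch/x86/events/core.c:1507 x86_pmu_start "
    "(arch/x86/events/core.c:1507 (discriminator 1))"
)
info = parse_warning(sample)
```

For the sample line this yields CPU 94, PID 0, `arch/x86/events/core.c` line 1507 and the function `x86_pmu_start`, matching the RIP line in the backtrace.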


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240119/202401191023.d52a4ad4-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

