Date: Sun, 30 Jun 2024 17:41:36 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Ingo Molnar <mingo@...nel.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<x86@...nel.org>, Oleg Nesterov <oleg@...hat.com>, Andy Lutomirski
	<luto@...nel.org>, Borislav Petkov <bp@...en8.de>, Fenghua Yu
	<fenghua.yu@...el.com>, "H. Peter Anvin" <hpa@...or.com>, Linus Torvalds
	<torvalds@...ux-foundation.org>, Dave Hansen <dave.hansen@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>, Uros Bizjak <ubizjak@...il.com>,
	<oliver.sang@...el.com>
Subject: [tip:WIP.x86/fpu] [x86/fpu]  8073335229:
 WARNING:at_mm/vmalloc.c:#remove_vm_area



Hello,

kernel test robot noticed "WARNING:at_mm/vmalloc.c:#remove_vm_area" on:

commit: 80733352295340ab492002bd023ea8a1db8025f5 ("x86/fpu: Remove the thread::fpu pointer")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git WIP.x86/fpu

[test failed on linux-next/master 62c97045b8f720c2eac807a5f38e26c9ed512371]

in testcase: stress-ng
version: stress-ng-x86_64-ecd3fe291-1_20240612
with the following parameters:

	nr_threads: 100%
	testtime: 60s
	test: sysfs
	cpufreq_governor: performance



compiler: gcc-13
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), please add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202406281539.7f1bd833-oliver.sang@intel.com


kern  :warn  : [  109.805527] ------------[ cut here ]------------
kern  :warn  : [  109.812194] Trying to vfree() bad address (00000000132062a9)
kern :warn : [  109.819933] WARNING: CPU: 84 PID: 1612 at mm/vmalloc.c:3195 remove_vm_area (mm/vmalloc.c:3195 (discriminator 1)) 
kern  :warn  : [  109.829916] Modules linked in: intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel btrfs kvm crct10dif_pclmul crc32_pclmul blake2b_generic ghash_clmulni_intel sd_mod sg sha512_ssse3 xor raid6_pq libcrc32c crc32c_intel rapl nvme nvme_core t10_pi intel_cstate ahci ast libahci ipmi_ssif mei_me i2c_i801 crc64_rocksoft_generic drm_shmem_helper crc64_rocksoft dax_hmem acpi_power_meter drm_kms_helper libata megaraid_sas mei wmi i2c_ismt crc64 i2c_smbus ipmi_si acpi_ipmi ipmi_devintf binfmt_misc ipmi_msghandler acpi_pad drm fuse loop dm_mod ip_tables
user  :notice: [  109.859859] stress-ng: metrc: [6575] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
kern  :warn  : [  109.888492] CPU: 84 PID: 1612 Comm: kworker/84:1 Not tainted 6.10.0-rc3-00004-g807333522953 #1

kern  :warn  : [  109.905370] Workqueue: events delayed_vfree_work
user  :notice: [  109.918106] stress-ng: metrc: [6575]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)

kern :warn : [  109.918484] RIP: 0010:remove_vm_area (mm/vmalloc.c:3195 (discriminator 1)) 

user  :notice: [  109.926459] stress-ng: metrc: [6575] sysfs            160196     60.00    120.35  12186.65      2669.78          13.02        91.56          1852
kern :warn : [ 109.940064] Code: 48 8b 70 08 e8 40 f2 ff ff 48 89 df e8 f8 ae ff ff 48 89 e8 5b 5d c3 cc cc cc cc 48 89 de 48 c7 c7 48 2c 99 82 e8 1f 74 d1 ff <0f> 0b 31 ed 5b 48 89 e8 5d c3 cc cc cc cc 90 90 90 90 90 90 90 90
All code
========
   0:	48 8b 70 08          	mov    0x8(%rax),%rsi
   4:	e8 40 f2 ff ff       	callq  0xfffffffffffff249
   9:	48 89 df             	mov    %rbx,%rdi
   c:	e8 f8 ae ff ff       	callq  0xffffffffffffaf09
  11:	48 89 e8             	mov    %rbp,%rax
  14:	5b                   	pop    %rbx
  15:	5d                   	pop    %rbp
  16:	c3                   	retq   
  17:	cc                   	int3   
  18:	cc                   	int3   
  19:	cc                   	int3   
  1a:	cc                   	int3   
  1b:	48 89 de             	mov    %rbx,%rsi
  1e:	48 c7 c7 48 2c 99 82 	mov    $0xffffffff82992c48,%rdi
  25:	e8 1f 74 d1 ff       	callq  0xffffffffffd17449
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	31 ed                	xor    %ebp,%ebp
  2e:	5b                   	pop    %rbx
  2f:	48 89 e8             	mov    %rbp,%rax
  32:	5d                   	pop    %rbp
  33:	c3                   	retq   
  34:	cc                   	int3   
  35:	cc                   	int3   
  36:	cc                   	int3   
  37:	cc                   	int3   
  38:	90                   	nop
  39:	90                   	nop
  3a:	90                   	nop
  3b:	90                   	nop
  3c:	90                   	nop
  3d:	90                   	nop
  3e:	90                   	nop
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	31 ed                	xor    %ebp,%ebp
   4:	5b                   	pop    %rbx
   5:	48 89 e8             	mov    %rbp,%rax
   8:	5d                   	pop    %rbp
   9:	c3                   	retq   
   a:	cc                   	int3   
   b:	cc                   	int3   
   c:	cc                   	int3   
   d:	cc                   	int3   
   e:	90                   	nop
   f:	90                   	nop
  10:	90                   	nop
  11:	90                   	nop
  12:	90                   	nop
  13:	90                   	nop
  14:	90                   	nop
  15:	90                   	nop
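
The disassembly above marks the trapping instruction at offset 0x2a. As a rough cross-check, the marked byte in the raw "Code:" line can be located with a few lines of Python. This is only a sketch: the helper name `trap_offset` is hypothetical, and it assumes the usual oops convention that the byte at the faulting RIP is wrapped in angle brackets.

```python
def trap_offset(code_line):
    """Return (offset, first two bytes) of the byte wrapped in <..> on a 'Code:' line."""
    tokens = code_line.split("Code:", 1)[1].split()
    for offset, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            # Strip the marker and convert every token to a raw byte.
            raw = bytes(int(t.strip("<>"), 16) for t in tokens)
            return offset, raw[offset:offset + 2]
    raise ValueError("no marked byte found")

# The "Code:" line copied from the log above.
code = ("Code: 48 8b 70 08 e8 40 f2 ff ff 48 89 df e8 f8 ae ff ff 48 89 e8 "
        "5b 5d c3 cc cc cc cc 48 89 de 48 c7 c7 48 2c 99 82 e8 1f 74 d1 ff "
        "<0f> 0b 31 ed 5b 48 89 e8 5d c3 cc cc cc cc 90 90 90 90 90 90 90 90")

offset, opcode = trap_offset(code)
print(hex(offset), opcode.hex())
```

The bytes 0f 0b encode ud2, which matches the instruction the disassembly flags as trapping.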

user  :notice: [  109.943272] stress-ng: metrc: [6575] miscellaneous metrics:
kern  :warn  : [  109.948465] RSP: 0018:ffa00000118a7e20 EFLAGS: 00010286
kern  :warn  : [  109.948468] RAX: 0000000000000000 RBX: ff1100406ccdd340 RCX: 0000000000000000

user  :notice: [  109.952683] stress-ng: metrc: [6575] sysfs                188.37 sysfs files exercised per sec (harmonic mean of 224 instances)
kern  :warn  : [  109.966542] RDX: ff11003fc1a2e2c0 RSI: ff11003fc1a20b00 RDI: ff11003fc1a20b00

kern  :warn  : [  109.991046] RBP: ff1100406ccdd340 R08: 0000000000000000 R09: 0000000000000003
kern  :warn  : [  109.991048] R10: ffa00000118a7cb8 R11: ff1100407fcb7fe8 R12: ff1100208000fe00
kern  :warn  : [  109.991048] R13: ff11003fc1a33b00 R14: ff1100208000fe05 R15: ff110001000408c0
kern  :warn  : [  109.991049] FS:  0000000000000000(0000) GS:ff11003fc1a00000(0000) knlGS:0000000000000000
user  :notice: [  109.998850] stress-ng: info:  [6575] for a 60.09s run time:
kern  :warn  : [  110.004913] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

user  :notice: [  110.014575] stress-ng: info:  [6575]   13460.96s available CPU time
kern  :warn  : [  110.016332] CR2: 000055666f401650 CR3: 000000407de9a006 CR4: 0000000000f71ef0

user  :notice: [  110.030915] stress-ng: info:  [6575]     120.35s user time   (  0.89%)
kern  :warn  : [  110.038976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

user  :notice: [  110.042309] stress-ng: info:  [6575]   12186.82s system time ( 90.53%)
kern  :warn  : [  110.050427] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
kern  :warn  : [  110.050429] PKRU: 55555554
kern  :warn  : [  110.050430] Call Trace:
kern  :warn  : [  110.050432]  <TASK>

user  :notice: [  110.060100] stress-ng: info:  [6575]   12307.17s total time  ( 91.43%)
kern :warn : [  110.068193] ? __warn (kernel/panic.c:693) 

user  :notice: [  110.078927] stress-ng: info:  [6575] load average: 476.09 137.10 47.15
kern :warn : [  110.085280] ? remove_vm_area (mm/vmalloc.c:3195 (discriminator 1)) 

kern :warn : [  110.095131] ? report_bug (lib/bug.c:180 lib/bug.c:219) 
user  :notice: [  110.103514] stress-ng: info:  [6575] skipped: 0
kern :warn : [  110.111930] ? handle_bug (arch/x86/kernel/traps.c:239) 

user  :notice: [  110.115126] stress-ng: info:  [6575] passed: 224: sysfs (224)
kern :warn : [  110.122694] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) 
kern :warn : [  110.122697] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) 

user  :notice: [  110.132072] stress-ng: info:  [6575] failed: 0
kern :warn : [  110.134144] ? remove_vm_area (mm/vmalloc.c:3195 (discriminator 1)) 
kern :warn : [  110.134147] vfree (mm/vmalloc.c:3316 (discriminator 2)) 

user  :notice: [  110.143001] stress-ng: info:  [6575] metrics untrustworthy: 0
kern :warn : [  110.151243] delayed_vfree_work (mm/vmalloc.c:3266 (discriminator 1)) 
kern :warn : [  110.151246] process_one_work (kernel/workqueue.c:3231) 

user  :notice: [  110.156131] stress-ng: info:  [6575] successful run completed in 1 min, 0.09 secs
kern :warn : [  110.158797] worker_thread (kernel/workqueue.c:3306 (discriminator 2) kernel/workqueue.c:3393 (discriminator 2)) 

user  :notice: [  110.221173] /usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8 http://10.239.97.5:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-spr-r02/stress-ng-performance-100%25-sysfs-60s-debian-12-x86_64-20240206.cgz-807333522953-20240625-54609-1oto9j9-13.yaml&job_state=post_run -O /dev/null
kern :warn : [  110.226000] ? __pfx_worker_thread (kernel/workqueue.c:3339) 

kern :warn : [  110.359162] ? __pfx_worker_thread (kernel/workqueue.c:3339) 
kern :warn : [  110.365216] kthread (kernel/kthread.c:389) 
kern :warn : [  110.369910] ? __pfx_kthread (kernel/kthread.c:342) 
kern :warn : [  110.375424] ret_from_fork (arch/x86/kernel/process.c:145) 
kern :warn : [  110.380820] ? __pfx_kthread (kernel/kthread.c:342) 
kern :warn : [  110.386304] ret_from_fork_asm (arch/x86/entry/entry_64.S:257) 
kern  :warn  : [  110.391956]  </TASK>
kern  :warn  : [  110.395774] ---[ end trace 0000000000000000 ]---
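
The trace above is interleaved with stress-ng output logged at the "user" facility. For reading the warning on its own, the kernel lines can be filtered out with a short Python sketch. The function name `kernel_lines` and the regular expression are illustrative assumptions based on the line layout seen in this log, not part of the lkp-tests tooling.

```python
import re

# Matches lines such as "kern  :warn  : [  109.805527] text"
# and the tighter variant "kern :warn : [ 109.940064] text".
LINE_RE = re.compile(
    r"^(?P<facility>\w+)\s*:(?P<level>\w+)\s*:\s*"
    r"\[\s*(?P<ts>\d+\.\d+)\]\s?(?P<msg>.*)$"
)

def kernel_lines(log_text):
    """Yield (timestamp, message) for kernel-facility lines only."""
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("facility") == "kern":
            yield float(m.group("ts")), m.group("msg").rstrip()

# A few lines copied from the log above (the user line is truncated here).
sample = """\
kern  :warn  : [  109.812194] Trying to vfree() bad address (00000000132062a9)
user  :notice: [  109.859859] stress-ng: metrc: [6575] stressor
kern :warn : [  110.134147] vfree (mm/vmalloc.c:3316 (discriminator 2)) 
"""

for ts, msg in kernel_lines(sample):
    print(f"[{ts:10.6f}] {msg}")
```

Blank lines and the disassembly block do not match the pattern and are skipped, so the same function can be run over the whole attached dmesg/kmsg text.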


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240628/202406281539.7f1bd833-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

