Message-ID: <20220317080504.GC735@xsang-OptiPlex-9020>
Date:   Thu, 17 Mar 2022 16:05:04 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Muchun Song <songmuchun@...edance.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Alistair Popple <apopple@...dia.com>,
        Christoph Hellwig <hch@...radead.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Hugh Dickins <hughd@...gle.com>, Jan Kara <jack@...e.cz>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Ralph Campbell <rcampbell@...dia.com>,
        Ross Zwisler <zwisler@...nel.org>,
        Xiongchun Duan <duanxiongchun@...edance.com>,
        Xiyu Yang <xiyuyang19@...an.edu.cn>,
        Yang Shi <shy828301@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com
Subject: [mm]  f886cdb769: kernel_BUG_at_include/linux/swapops.h



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: f886cdb76920131b030ffae13e752d8d0ff440f0 ("mm: pvmw: add support for walking devmap pages")
url: https://github.com/0day-ci/linux/commits/Petr-Mladek/kthread-Make-it-clear-that-kthread_create_on_node-might-be-terminated-by-any-fatal-signal/20220315-182614

in testcase: will-it-scale
version: will-it-scale-x86_64-a34a85c-1_20220312
with the following parameters:

	nr_task: 100%
	mode: process
	test: lock1
	cpufreq_governor: performance
	ucode: 0x2006c0a

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale


on test machine: 104 threads 2 sockets Skylake with 192G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


[  163.838480][ T4483] kernel BUG at include/linux/swapops.h:258!
[  163.844868][ T4483] invalid opcode: 0000 [#1] SMP PTI
[  163.850544][ T4483] CPU: 103 PID: 4483 Comm: perf Not tainted 5.17.0-rc7-mm1-00347-gf886cdb76920 #1
[ 163.860216][ T4483] RIP: 0010:migration_entry_wait_on_locked (include/linux/swapops.h:258 mm/filemap.c:1412) 
[ 163.867512][ T4483] Code: 66 90 e9 73 fe ff ff 66 90 e9 3c ff ff ff 48 8b 43 08 a8 01 0f 85 84 00 00 00 66 90 48 89 d8 48 8b 00 a8 01 0f 85 10 fe ff ff <0f> 0b 48 8d 58 ff e9 16 fe ff ff 65 48 8b 04 25 00 ad 01 00 48 83
All code
========
   0:	66 90                	xchg   %ax,%ax
   2:	e9 73 fe ff ff       	jmpq   0xfffffffffffffe7a
   7:	66 90                	xchg   %ax,%ax
   9:	e9 3c ff ff ff       	jmpq   0xffffffffffffff4a
   e:	48 8b 43 08          	mov    0x8(%rbx),%rax
  12:	a8 01                	test   $0x1,%al
  14:	0f 85 84 00 00 00    	jne    0x9e
  1a:	66 90                	xchg   %ax,%ax
  1c:	48 89 d8             	mov    %rbx,%rax
  1f:	48 8b 00             	mov    (%rax),%rax
  22:	a8 01                	test   $0x1,%al
  24:	0f 85 10 fe ff ff    	jne    0xfffffffffffffe3a
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	48 8d 58 ff          	lea    -0x1(%rax),%rbx
  30:	e9 16 fe ff ff       	jmpq   0xfffffffffffffe4b
  35:	65 48 8b 04 25 00 ad 	mov    %gs:0x1ad00,%rax
  3c:	01 00 
  3e:	48                   	rex.W
  3f:	83                   	.byte 0x83

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	48 8d 58 ff          	lea    -0x1(%rax),%rbx
   6:	e9 16 fe ff ff       	jmpq   0xfffffffffffffe21
   b:	65 48 8b 04 25 00 ad 	mov    %gs:0x1ad00,%rax
  12:	01 00 
  14:	48                   	rex.W
  15:	83                   	.byte 0x83
[  163.888105][ T4483] RSP: 0000:ffffc900233afd60 EFLAGS: 00010246
[  163.894594][ T4483] RAX: 0017ffffc0000000 RBX: ffffea000b000000 RCX: 000000000000001b
[  163.903001][ T4483] RDX: ffffea0004ae2468 RSI: 0000000000000000 RDI: 6c000000002c0000
[  163.911401][ T4483] RBP: 0400000000000080 R08: 0000000000100073 R09: ffff88812b890f50
[  163.919811][ T4483] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000e28
[  163.928219][ T4483] R13: 0400000000000000 R14: ffff8881ecf57450 R15: fff000003fffffff
[  163.936625][ T4483] FS:  00007f7ab9199d40(0000) GS:ffff88afa62c0000(0000) knlGS:0000000000000000
[  163.946000][ T4483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  163.953026][ T4483] CR2: 00007f7ab8a06950 CR3: 000000016e806001 CR4: 00000000007706e0
[  163.961451][ T4483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  163.969879][ T4483] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  163.978305][ T4483] PKRU: 55555554
[  163.982310][ T4483] Call Trace:
[  163.986055][ T4483]  <TASK>
[ 163.989441][ T4483] ? do_huge_pmd_numa_page (mm/huge_memory.c:1460) 
[ 163.995343][ T4483] __handle_mm_fault (mm/memory.c:4608 mm/memory.c:4704) 
[ 164.000726][ T4483] handle_mm_fault (mm/memory.c:4802) 
[ 164.005854][ T4483] do_user_addr_fault (arch/x86/mm/fault.c:1397) 
[ 164.011326][ T4483] exc_page_fault (arch/x86/include/asm/irqflags.h:40 arch/x86/include/asm/irqflags.h:75 arch/x86/mm/fault.c:1492 arch/x86/mm/fault.c:1540) 
[ 164.016375][ T4483] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:568) 
[ 164.021757][ T4483] asm_exc_page_fault (arch/x86/include/asm/idtentry.h:568) 
[  164.027042][ T4483] RIP: 0033:0x5565db69fcde
[ 164.031894][ T4483] Code: f8 31 c0 48 8b 45 f8 64 48 33 04 25 28 00 00 00 75 02 c9 c3 e8 83 49 f3 ff 0f 1f 00 55 48 89 e5 41 54 53 48 89 fb 48 83 ec 10 <48> 8b bf a8 00 01 00 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0
All code
========
   0:	f8                   	clc    
   1:	31 c0                	xor    %eax,%eax
   3:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
   7:	64 48 33 04 25 28 00 	xor    %fs:0x28,%rax
   e:	00 00 
  10:	75 02                	jne    0x14
  12:	c9                   	leaveq 
  13:	c3                   	retq   
  14:	e8 83 49 f3 ff       	callq  0xfffffffffff3499c
  19:	0f 1f 00             	nopl   (%rax)
  1c:	55                   	push   %rbp
  1d:	48 89 e5             	mov    %rsp,%rbp
  20:	41 54                	push   %r12
  22:	53                   	push   %rbx
  23:	48 89 fb             	mov    %rdi,%rbx
  26:	48 83 ec 10          	sub    $0x10,%rsp
  2a:*	48 8b bf a8 00 01 00 	mov    0x100a8(%rdi),%rdi		<-- trapping instruction
  31:	64 48 8b 04 25 28 00 	mov    %fs:0x28,%rax
  38:	00 00 
  3a:	48 89 45 e8          	mov    %rax,-0x18(%rbp)
  3e:	31 c0                	xor    %eax,%eax

Code starting with the faulting instruction
===========================================
   0:	48 8b bf a8 00 01 00 	mov    0x100a8(%rdi),%rdi
   7:	64 48 8b 04 25 28 00 	mov    %fs:0x28,%rax
   e:	00 00 
  10:	48 89 45 e8          	mov    %rax,-0x18(%rbp)
  14:	31 c0                	xor    %eax,%eax
[  164.052568][ T4483] RSP: 002b:00007fff1d7d6b40 EFLAGS: 00010202
[  164.059122][ T4483] RAX: 0000000000000000 RBX: 00007f7ab89f68a8 RCX: 0000000000000000
[  164.067592][ T4483] RDX: 0000000000001000 RSI: 0000000000401000 RDI: 00007f7ab89f68a8
[  164.076078][ T4483] RBP: 00007fff1d7d6b60 R08: 0000000000000a9f R09: 00005565dc5248a0
[  164.084546][ T4483] R10: 0000000000000000 R11: 0000000000000206 R12: 000000000000000c
[  164.093005][ T4483] R13: 00000000000c0960 R14: 00007fff1d7d9c20 R15: 0000000000000000
[  164.101455][ T4483]  </TASK>
[  164.104956][ T4483] Modules linked in: binfmt_misc btrfs blake2b_generic xor raid6_pq intel_rapl_msr intel_rapl_common zstd_compress libcrc32c ast drm_vram_helper drm_ttm_helper sd_mod ttm sg nvme skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ipmi_ssif drm_kms_helper rapl nvme_core syscopyarea sysfillrect t10_pi intel_cstate ahci sysimgblt acpi_ipmi libahci fb_sys_fops crc64_rocksoft_generic mei_me intel_uncore ipmi_si crc64_rocksoft drm crc64 ioatdma libata mei joydev ipmi_devintf intel_pch_thermal wmi dca ipmi_msghandler acpi_pad acpi_power_meter ip_tables
[  164.168156][ T4483] ---[ end trace 0000000000000000 ]---
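
A note on reading the register dump: in the decoded code, the instruction just before the trap is `test $0x1,%al`, executed right after `mov (%rax),%rax` loads a word through the pointer copied from RBX. A plausible reading, and only an assumption here, is that RAX holds the flags word of a struct page and that bit 0 is the page-locked bit, so the BUG fires because the page is not locked. The sketch below just checks that bit in the RAX value from the dump; the name PG_LOCKED_BIT and its value are illustrative assumptions, not taken from this report.

```python
# RAX as printed in the register dump of the oops.
# Assumption: at the trap it holds the word loaded by `mov (%rax),%rax`.
RAX = 0x0017FFFFC0000000

# Assumption: the bit tested by `test $0x1,%al` is the page-locked flag.
PG_LOCKED_BIT = 0


def is_bit_set(value: int, bit: int) -> bool:
    """Return True if the given bit is set in value."""
    return (value >> bit) & 1 == 1


page_locked = is_bit_set(RAX, PG_LOCKED_BIT)
print(f"RAX = {RAX:#018x}, bit {PG_LOCKED_BIT} set: {page_locked}")
```

With bit 0 clear, the `jne` before the trap is not taken and execution falls through to `ud2`, which is consistent with the decoded listing.
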
[ 164.200833][ T4483] RIP: 0010:migration_entry_wait_on_locked (include/linux/swapops.h:258 mm/filemap.c:1412) 
[ 164.208245][ T4483] Code: 66 90 e9 73 fe ff ff 66 90 e9 3c ff ff ff 48 8b 43 08 a8 01 0f 85 84 00 00 00 66 90 48 89 d8 48 8b 00 a8 01 0f 85 10 fe ff ff <0f> 0b 48 8d 58 ff e9 16 fe ff ff 65 48 8b 04 25 00 ad 01 00 48 83
All code
========
   0:	66 90                	xchg   %ax,%ax
   2:	e9 73 fe ff ff       	jmpq   0xfffffffffffffe7a
   7:	66 90                	xchg   %ax,%ax
   9:	e9 3c ff ff ff       	jmpq   0xffffffffffffff4a
   e:	48 8b 43 08          	mov    0x8(%rbx),%rax
  12:	a8 01                	test   $0x1,%al
  14:	0f 85 84 00 00 00    	jne    0x9e
  1a:	66 90                	xchg   %ax,%ax
  1c:	48 89 d8             	mov    %rbx,%rax
  1f:	48 8b 00             	mov    (%rax),%rax
  22:	a8 01                	test   $0x1,%al
  24:	0f 85 10 fe ff ff    	jne    0xfffffffffffffe3a
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	48 8d 58 ff          	lea    -0x1(%rax),%rbx
  30:	e9 16 fe ff ff       	jmpq   0xfffffffffffffe4b
  35:	65 48 8b 04 25 00 ad 	mov    %gs:0x1ad00,%rax
  3c:	01 00 
  3e:	48                   	rex.W
  3f:	83                   	.byte 0x83

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	48 8d 58 ff          	lea    -0x1(%rax),%rbx
   6:	e9 16 fe ff ff       	jmpq   0xfffffffffffffe21
   b:	65 48 8b 04 25 00 ad 	mov    %gs:0x1ad00,%rax
  12:	01 00 
  14:	48                   	rex.W
  15:	83                   	.byte 0x83
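
For readers who want to locate the trapping instruction without the decoded listing: in the "Code:" line of the oops, the byte wrapped in angle brackets marks where RIP pointed when the exception was raised. The following is a minimal sketch, not part of the report's tooling, that finds that marker and reports its offset within the dumped bytes; the function name find_trap is hypothetical.

```python
# Byte dump copied from the "Code:" line of the oops above.
CODE_LINE = (
    "66 90 e9 73 fe ff ff 66 90 e9 3c ff ff ff 48 8b 43 08 a8 01 "
    "0f 85 84 00 00 00 66 90 48 89 d8 48 8b 00 a8 01 0f 85 10 fe ff ff "
    "<0f> 0b 48 8d 58 ff e9 16 fe ff ff 65 48 8b 04 25 00 ad 01 00 48 83"
)


def find_trap(code_line: str) -> tuple[int, bytes]:
    """Return (offset, first two bytes) at the <..> marker in a Code: line."""
    tokens = code_line.split()
    for offset, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            # Take the marked byte and the one after it as the opcode bytes.
            pair = tokens[offset:offset + 2]
            opcode = bytes(int(t.strip("<>"), 16) for t in pair)
            return offset, opcode
    raise ValueError("no <..> marker found in Code: line")


offset, opcode = find_trap(CODE_LINE)
print(f"trap at offset {offset:#x}, bytes {opcode.hex(' ')}")
```

The offset agrees with the line marked `2a:*` in the decoded listing, and the bytes `0f 0b` are the encoding of `ud2`, the instruction the x86 kernel emits for BUG().
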


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.



---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/lkp@lists.01.org

Thanks,
Oliver Sang


View attachment "config-5.17.0-rc7-mm1-00347-gf886cdb76920" of type "text/plain" (139614 bytes)

View attachment "job-script" of type "text/plain" (7696 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (36604 bytes)

View attachment "job.yaml" of type "text/plain" (5251 bytes)

View attachment "reproduce" of type "text/plain" (342 bytes)
