Message-ID: <ZILURoD7J3eVtXsV@xsang-OptiPlex-9020>
Date:   Fri, 9 Jun 2023 15:27:02 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Alexander Potapenko <glider@...gle.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Dipanjan Das <mail.dipanjan.das@...il.com>,
        Marco Elver <elver@...gle.com>,
        "Christoph Hellwig" <hch@...radead.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        "Uladzislau Rezki" <urezki@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, <lkp@...ts.01.org>,
        <lkp@...el.com>, <oliver.sang@...el.com>
Subject: [mm]  47ebd0310e: kernel_BUG_at_arch/x86/kernel/irqinit.c


hi, Alexander Potapenko,

in the tests below, both 47ebd0310e and its parent show various issues, but
we found some differences between them:

both 47ebd0310e and its parent first show the issue
  "WARNING:at_mm/vmalloc.c:#__vmap_pages_range_noflush"
for example:

[    0.010000][    T0] WARNING: CPU: 0 PID: 0 at mm/vmalloc.c:477 __vmap_pages_range_noflush+0x1182/0x1550

after that, 47ebd0310e shows the issue
  "kernel_BUG_at_arch/x86/kernel/irqinit.c", for example:

[    0.010000][    T0] kernel BUG at arch/x86/kernel/irqinit.c:89!

and then it crashes shortly afterwards:

[    0.010000][    T0] Kernel panic - not syncing: Fatal exception

for the parent, however, we did not observe
  "kernel_BUG_at_arch/x86/kernel/irqinit.c"
it runs further, hitting the other issues listed in the table in the full
report below, until:

[   13.341722][    T1] general protection fault, maybe for address 0x0: 0000 [#1] PREEMPT SMP

then:

[   13.365591][    T1] Kernel panic - not syncing: Fatal exception

the dmesg logs for both 47ebd0310e and its parent are attached.
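
As a rough illustration of the comparison described above, the following Python sketch pulls the issue lines out of two dmesg excerpts and prints them in order. The log text is copied from the lines quoted in this message; the `MARKERS` list and the names `issue_lines`, `COMMIT_LOG` and `PARENT_LOG` are assumptions made for this example and are not part of lkp-tests.

```python
import re

# Matches lines such as "[    0.010000][    T0] kernel BUG at ...".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\[\s*(T\d+)\]\s?(.*)$")

# Substrings treated as "issues" here; this list is an assumption for illustration.
MARKERS = ("WARNING:", "kernel BUG at", "general protection fault", "Kernel panic")


def issue_lines(dmesg_text):
    """Return (timestamp, task, message) for every log line containing a marker."""
    found = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line)
        if m and any(marker in m.group(3) for marker in MARKERS):
            found.append((float(m.group(1)), m.group(2), m.group(3).strip()))
    return found


# Excerpts quoted in this report (not the full attached logs).
COMMIT_LOG = """\
[    0.010000][    T0] WARNING: CPU: 0 PID: 0 at mm/vmalloc.c:477 __vmap_pages_range_noflush+0x1182/0x1550
[    0.010000][    T0] kernel BUG at arch/x86/kernel/irqinit.c:89!
[    0.010000][    T0] Kernel panic - not syncing: Fatal exception
"""

PARENT_LOG = """\
[    0.010000][    T0] WARNING: CPU: 0 PID: 0 at mm/vmalloc.c:477 __vmap_pages_range_noflush+0x1182/0x1550
[   13.341722][    T1] general protection fault, maybe for address 0x0: 0000 [#1] PREEMPT SMP
[   13.365591][    T1] Kernel panic - not syncing: Fatal exception
"""

for name, log in (("47ebd0310e", COMMIT_LOG), ("parent", PARENT_LOG)):
    for ts, task, msg in issue_lines(log):
        print(f"{name:>10} {ts:10.6f} {task} {msg}")
```

With these excerpts, both kernels start with the same vmalloc warning; the second issue is where they diverge.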

we are not sure whether these differences are meaningful, since both kernels
first hit the same issue:
  "WARNING:at_mm/vmalloc.c:#__vmap_pages_range_noflush"
(if they are not meaningful, this report can probably be ignored)
nor do we know how they actually relate to the code change in 47ebd0310e.

we are just reporting what we found in our tests for your information, and
hope it provides some hints.

if you need more tests, please let us know. Thanks!

the detailed report is below.


Greetings,

FYI, we noticed the following commit (built with clang-15):

commit: 47ebd0310e89c087f56e58c103c44b72a2f6b216 ("mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


+--------------------------------------------------------------+------------+------------+
|                                                              | a101482421 | 47ebd0310e |
+--------------------------------------------------------------+------------+------------+
| boot_successes                                               | 0          | 0          |
| boot_failures                                                | 9          | 6          |
| WARNING:at_mm/vmalloc.c:#__vmap_pages_range_noflush          | 9          | 6          |
| RIP:__vmap_pages_range_noflush                               | 9          | 6          |
| WARNING:at_mm/kmsan/shadow.c:#kmsan_vmap_pages_range_noflush | 9          |            |
| RIP:kmsan_vmap_pages_range_noflush                           | 9          |            |
| maybe_for_address#:#[##]                                     | 9          |            |
| RIP:memset                                                   | 9          |            |
| Kernel_panic-not_syncing:Fatal_exception                     | 9          | 6          |
| kernel_BUG_at_arch/x86/kernel/irqinit.c                      | 0          | 6          |
| invalid_opcode:#[##]                                         | 0          | 6          |
| RIP:init_IRQ                                                 | 0          | 6          |
+--------------------------------------------------------------+------------+------------+
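
To make the table easier to read, here is a minimal Python sketch that parses its rows and lists the issues seen on only one side. It assumes that an empty cell means the issue was not observed (treated as 0); the names `TABLE`, `parse_rows`, `only_in_parent` and `only_in_commit` are made up for this example.

```python
# Rows copied from the table above, with padding removed.
TABLE = """\
| | a101482421 | 47ebd0310e |
| boot_successes | 0 | 0 |
| boot_failures | 9 | 6 |
| WARNING:at_mm/vmalloc.c:#__vmap_pages_range_noflush | 9 | 6 |
| RIP:__vmap_pages_range_noflush | 9 | 6 |
| WARNING:at_mm/kmsan/shadow.c:#kmsan_vmap_pages_range_noflush | 9 | |
| RIP:kmsan_vmap_pages_range_noflush | 9 | |
| maybe_for_address#:#[##] | 9 | |
| RIP:memset | 9 | |
| Kernel_panic-not_syncing:Fatal_exception | 9 | 6 |
| kernel_BUG_at_arch/x86/kernel/irqinit.c | 0 | 6 |
| invalid_opcode:#[##] | 0 | 6 |
| RIP:init_IRQ | 0 | 6 |
"""


def parse_rows(table_text):
    """Map each row name to (parent_count, commit_count)."""
    rows = {}
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip "+----+" separator lines
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 3 or not cells[0]:
            continue  # skip the header row, whose first cell is empty
        name, parent, commit = cells
        # Assumption: an empty cell means the issue was not observed.
        rows[name] = (int(parent or 0), int(commit or 0))
    return rows


rows = parse_rows(TABLE)
only_in_parent = [n for n, (p, c) in rows.items() if p > 0 and c == 0]
only_in_commit = [n for n, (p, c) in rows.items() if p == 0 and c > 0]

print("only in a101482421:", only_in_parent)
print("only in 47ebd0310e:", only_in_commit)
```

Under that assumption, the irqinit.c BUG, the invalid opcode and RIP:init_IRQ appear only with 47ebd0310e, while the kmsan shadow.c warning and the memset fault appear only with the parent.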


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


[    0.010000][    T0] kernel BUG at arch/x86/kernel/irqinit.c:89!
[    0.010000][    T0] invalid opcode: 0000 [#1] PREEMPT SMP
[    0.010000][    T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W          6.3.0-rc4-00044-g47ebd0310e89 #1
[ 0.010000][ T0] RIP: 0010:init_IRQ (arch/x86/kernel/irqinit.c:89) 
[ 0.010000][ T0] Code: 3f cd fb e9 08 fe ff ff 8b 3a e8 56 3f cd fb e9 10 fe ff ff e8 4c 3f cd fb eb 9d 41 8b bc 24 a8 0f 00 00 e8 3d 3f cd fb eb aa <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57
All code
========
   0:	3f                   	(bad)
   1:	cd fb                	int    $0xfb
   3:	e9 08 fe ff ff       	jmp    0xfffffffffffffe10
   8:	8b 3a                	mov    (%rdx),%edi
   a:	e8 56 3f cd fb       	call   0xfffffffffbcd3f65
   f:	e9 10 fe ff ff       	jmp    0xfffffffffffffe24
  14:	e8 4c 3f cd fb       	call   0xfffffffffbcd3f65
  19:	eb 9d                	jmp    0xffffffffffffffb8
  1b:	41 8b bc 24 a8 0f 00 	mov    0xfa8(%r12),%edi
  22:	00 
  23:	e8 3d 3f cd fb       	call   0xfffffffffbcd3f65
  28:	eb aa                	jmp    0xffffffffffffffd4
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	66 0f 1f 84 00 00 00 	nopw   0x0(%rax,%rax,1)
  33:	00 00 
  35:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  3a:	55                   	push   %rbp
  3b:	48 89 e5             	mov    %rsp,%rbp
  3e:	41 57                	push   %r15

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	66 0f 1f 84 00 00 00 	nopw   0x0(%rax,%rax,1)
   9:	00 00 
   b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  10:	55                   	push   %rbp
  11:	48 89 e5             	mov    %rsp,%rbp
  14:	41 57                	push   %r15
[    0.010000][    T0] RSP: 0000:ffffffff85403e68 EFLAGS: 00010082
[    0.010000][    T0] RAX: 0000000000000000 RBX: 00000000fffffff4 RCX: 0000000000000000
[    0.010000][    T0] RDX: ffff88841f48d3d0 RSI: 0000000000000004 RDI: 0000000000262088
[    0.010000][    T0] RBP: ffffffff85403eb8 R08: ffffea000000000f R09: ffff88843ffbf000
[    0.010000][    T0] R10: ffff8883ef369a98 R11: 0000000000000000 R12: ffffffff85416b48
[    0.010000][    T0] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000010
[    0.010000][    T0] FS:  0000000000000000(0000) GS:ffff88843f600000(0000) knlGS:0000000000000000
[    0.010000][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.010000][    T0] CR2: ffff88843ffff000 CR3: 000000000545a000 CR4: 00000000000406b0
[    0.010000][    T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.010000][    T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.010000][    T0] Call Trace:
[    0.010000][    T0]  <TASK>
[ 0.010000][ T0] start_kernel (init/main.c:1050) 
[ 0.010000][ T0] ? __msan_metadata_ptr_for_load_8 (arch/x86/include/asm/smap.h:56 mm/kmsan/instrumentation.c:37 mm/kmsan/instrumentation.c:92) 
[ 0.010000][ T0] x86_64_start_reservations (arch/x86/kernel/head64.c:557) 
[ 0.010000][ T0] x86_64_start_kernel (arch/x86/kernel/head64.c:538) 
[ 0.010000][ T0] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358) 
[    0.010000][    T0]  </TASK>
[    0.010000][    T0] Modules linked in:
[    0.010000][    T0] ---[ end trace 0000000000000000 ]---
[ 0.010000][ T0] RIP: 0010:init_IRQ (arch/x86/kernel/irqinit.c:89) 
[ 0.010000][ T0] Code: 3f cd fb e9 08 fe ff ff 8b 3a e8 56 3f cd fb e9 10 fe ff ff e8 4c 3f cd fb eb 9d 41 8b bc 24 a8 0f 00 00 e8 3d 3f cd fb eb aa <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57
All code
========
   0:	3f                   	(bad)
   1:	cd fb                	int    $0xfb
   3:	e9 08 fe ff ff       	jmp    0xfffffffffffffe10
   8:	8b 3a                	mov    (%rdx),%edi
   a:	e8 56 3f cd fb       	call   0xfffffffffbcd3f65
   f:	e9 10 fe ff ff       	jmp    0xfffffffffffffe24
  14:	e8 4c 3f cd fb       	call   0xfffffffffbcd3f65
  19:	eb 9d                	jmp    0xffffffffffffffb8
  1b:	41 8b bc 24 a8 0f 00 	mov    0xfa8(%r12),%edi
  22:	00 
  23:	e8 3d 3f cd fb       	call   0xfffffffffbcd3f65
  28:	eb aa                	jmp    0xffffffffffffffd4
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	66 0f 1f 84 00 00 00 	nopw   0x0(%rax,%rax,1)
  33:	00 00 
  35:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  3a:	55                   	push   %rbp
  3b:	48 89 e5             	mov    %rsp,%rbp
  3e:	41 57                	push   %r15

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	66 0f 1f 84 00 00 00 	nopw   0x0(%rax,%rax,1)
   9:	00 00 
   b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  10:	55                   	push   %rbp
  11:	48 89 e5             	mov    %rsp,%rbp
  14:	41 57                	push   %r15
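
For readers unfamiliar with the "Code:" line, the byte wrapped in `<...>` marks where the trap occurred, and its position gives the `2a:*` offset shown in the disassembly above. The Python sketch below recomputes that offset from the raw line; the function name `trapping_offset` is hypothetical and only illustrates the idea.

```python
def trapping_offset(code_line):
    """Return (offset, first_byte, second_byte) for the byte marked with <..>.

    second_byte is None when the marked byte is the last one on the line.
    """
    tokens = code_line.split("Code:", 1)[1].split()
    for offset, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            first = int(token[1:-1], 16)
            second = int(tokens[offset + 1], 16) if offset + 1 < len(tokens) else None
            return offset, first, second
    raise ValueError("no <..> marker found on the Code: line")


# The Code: line from the oops above.
CODE = (
    "Code: 3f cd fb e9 08 fe ff ff 8b 3a e8 56 3f cd fb e9 10 fe ff ff "
    "e8 4c 3f cd fb eb 9d 41 8b bc 24 a8 0f 00 00 e8 3d 3f cd fb eb aa "
    "<0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57"
)

offset, first, second = trapping_offset(CODE)
print(f"trapping byte at offset {offset:#x}: {first:02x} {second:02x}")
```

The two bytes at that offset, `0f 0b`, are the x86 `ud2` instruction, which is consistent with the "invalid opcode" line accompanying the "kernel BUG at" message.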


To reproduce:

        # build kernel
	cd linux
	cp config-6.3.0-rc4-00044-g47ebd0310e89 .config
	make HOSTCC=clang-15 CC=clang-15 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
	make HOSTCC=clang-15 CC=clang-15 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
	cd <mod-install-dir>
	find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-6.3.0-rc4-00044-g47ebd0310e89" of type "text/plain" (136272 bytes)

View attachment "job-script" of type "text/plain" (4778 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (5044 bytes)

Download attachment "dmesg-parent-a101482421.xz" of type "application/x-xz" (7928 bytes)
