lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20190726081026.GE22106@shao2-debian>
Date:   Fri, 26 Jul 2019 16:10:26 +0800
From:   kernel test robot <lkp@...el.com>
To:     Ferdinand Blomqvist <ferdinand.blomqvist@...il.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [rslib] 4b4f3accd8: INFO:rcu_sched_self-detected_stall_on_CPU

FYI, we noticed the following commit (built with gcc-4.9):

commit: 4b4f3accd80304781c648b26ce4d53df082a4087 ("rslib: Add tests for the encoder and decoder")
https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: locktorture
with following parameters:

	runtime: 300s
	test: default

test-description: This torture test consists of creating a number of kernel threads which acquire the lock and hold it for specific amount of time, thus simulating different critical region behaviors.
test-url: https://www.kernel.org/doc/Documentation/locking/locktorture.txt


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 2G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+----------------------------------------------------+----------+------------+
|                                                    | v5.2-rc6 | 4b4f3accd8 |
+----------------------------------------------------+----------+------------+
| boot_successes                                     | 10       | 0          |
| boot_failures                                      | 13       | 15         |
| BUG:kernel_reboot-without-warning_in_test_stage    | 13       |            |
| INFO:rcu_sched_self-detected_stall_on_CPU          | 0        | 15         |
| BUG:kernel_hang_in_boot-around-mounting-root_stage | 0        | 2          |
+----------------------------------------------------+----------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <lkp@...el.com>


[  115.336139] rcu: INFO: rcu_sched self-detected stall on CPU
[  115.336168] rcu: 	1-....: (31500 ticks this GP) idle=bea/1/0x4000000000000002 softirq=1099/1099 fqs=10102 
[  115.339505] 	(t=31502 jiffies g=297 q=94)
[  115.339505] NMI backtrace for cpu 1
[  115.339505] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc6-00001-g4b4f3accd80304 #1
[  115.342832] Call Trace:
[  115.342832]  <IRQ>
[  115.342832]  ? dump_stack+0x5e/0x8b
[  115.342832]  ? nmi_cpu_backtrace+0xac/0xb0
[  115.342832]  ? lapic_can_unplug_cpu+0xc0/0xc0
[  115.342832]  ? nmi_trigger_cpumask_backtrace+0xf5/0x130
[  115.342832]  ? rcu_dump_cpu_stacks+0x97/0xc5
[  115.342832]  ? rcu_sched_clock_irq+0x61b/0x820
[  115.342832]  ? update_process_times+0x24/0x40
[  115.342832]  ? tick_periodic+0x2b/0xc0
[  115.342832]  ? tick_periodic+0x5/0xc0
[  115.342832]  ? tick_handle_periodic+0x20/0x60
[  115.342832]  ? smp_apic_timer_interrupt+0x72/0x230
[  115.342832]  ? apic_timer_interrupt+0xf/0x20
[  115.342832]  </IRQ>
[  115.342832]  ? decode_rs16+0xa70/0xba0
[  115.342832]  ? ex_rs_helper+0x431/0x4e0
[  115.342832]  ? exercise_rs+0x51/0x90
[  115.342832]  ? test_rslib_init+0x174/0x222
[  115.342832]  ? sw842_init+0x12e/0x12e
[  115.342832]  ? do_one_initcall+0x66/0x2f4
[  115.342832]  ? rcu_read_lock_sched_held+0x49/0x90
[  115.342832]  ? kernel_init_freeable+0x14e/0x1d4
[  115.342832]  ? rest_init+0x260/0x260
[  115.342832]  ? kernel_init+0xa/0x100
[  115.342832]  ? ret_from_fork+0x24/0x30
[  116.137768]     FAIL: 52901 decoding failures!
[  116.140836] Testing beyond error correction capacity...
[  120.925031]     FAIL: 12047 silent failures!
[  120.927456] Testing (89,57)_256 code...
[  120.928101] Testing up to error correction capacity...
[  120.929096]   Testing correction buffer interface...
[  122.090172]     FAIL: 143306 decoding failures!
[  122.091517]   Testing with caller provided syndrome...
[  123.630870]     FAIL: 142416 decoding failures!
[  123.634364]   Testing in-place interface...
[  124.823230]     FAIL: 54638 decoding failures!
[  124.824023] Testing beyond error correction capacity...
[  128.812285]     FAIL: 11981 silent failures!
[  128.815469] Testing (33,1)_256 code...
[  128.816591] Testing up to error correction capacity...
[  128.818033]   Testing correction buffer interface...
[  129.652550]     FAIL: 143289 decoding failures!
[  129.653361]   Testing with caller provided syndrome...
[  130.616887]     FAIL: 143451 decoding failures!
[  130.620382]   Testing in-place interface...
[  131.451569]     FAIL: 55813 decoding failures!
[  131.452366] Testing beyond error correction capacity...
[  132.452832]     FAIL: 828 silent failures!
[  132.455008] Testing (511,478)_512 code...
[  132.455690] Testing up to error correction capacity...
[  132.456555]   Testing correction buffer interface...
[  136.053994]     FAIL: 180126 decoding failures!
[  136.057475]   Testing with caller provided syndrome...
[  136.412915] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 1-... } 32000 jiffies s: 45 root: 0x2/.
[  136.415727] rcu: blocking rcu_node structures:
[  136.417003] Task dump for CPU 1:
[  136.417918] swapper/0       R  running task    13272     1      0 0x80004008
[  136.419847] Call Trace:
[  136.420530]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  136.421803]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  136.423157]  ? lockdep_hardirqs_on+0x148/0x1c0
[  136.424458]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  136.425815]  ? retint_kernel+0x10/0x10
[  136.426928]  ? encode_rs16+0x112/0x1c0
[  136.427996]  ? decode_rs16+0x4b5/0xba0
[  136.429120]  ? encode_rs16+0x12f/0x1c0
[  136.430282]  ? get_rcw_we+0xc3/0x1e0
[  136.432472]  ? ex_rs_helper+0x17e/0x4e0
[  136.433119]  ? exercise_rs+0x51/0x90
[  136.433726]  ? test_rslib_init+0x174/0x222
[  136.434395]  ? sw842_init+0x12e/0x12e
[  136.435091]  ? do_one_initcall+0x66/0x2f4
[  136.435742]  ? rcu_read_lock_sched_held+0x49/0x90
[  136.436599]  ? kernel_init_freeable+0x14e/0x1d4
[  136.437347]  ? rest_init+0x260/0x260
[  136.437927]  ? kernel_init+0xa/0x100
[  136.438561]  ? ret_from_fork+0x24/0x30
[  141.451983]     FAIL: 180645 decoding failures!
[  141.452783]   Testing in-place interface...
[  145.253649]     FAIL: 33968 decoding failures!
[  145.255895] Testing beyond error correction capacity...
[  157.031546]     FAIL: 615 silent failures!
[  157.034597] Testing (392,359)_512 code...
[  157.035286] Testing up to error correction capacity...
[  157.036163]   Testing correction buffer interface...
[  160.087885]     FAIL: 136129 decoding failures!
[  160.091092]   Testing with caller provided syndrome...
[  164.580667]     FAIL: 136686 decoding failures!
[  164.585268]   Testing in-place interface...
[  167.906792]     FAIL: 40765 decoding failures!
[  167.908221] Testing beyond error correction capacity...
[  177.647979]     FAIL: 10635 silent failures!
[  177.648758] Testing (273,240)_512 code...
[  177.649438] Testing up to error correction capacity...
[  177.650330]   Testing correction buffer interface...
[  180.025183]     FAIL: 127255 decoding failures!
[  180.026033]   Testing with caller provided syndrome...
[  183.238909]     FAIL: 127965 decoding failures!
[  183.241185]   Testing in-place interface...
[  185.607359]     FAIL: 42399 decoding failures!
[  185.611062] Testing beyond error correction capacity...
[  193.056046]     FAIL: 10640 silent failures!
[  193.058223] Testing (154,121)_512 code...
[  193.058910] Testing up to error correction capacity...
[  193.059822]   Testing correction buffer interface...
[  194.725534]     FAIL: 113721 decoding failures!
[  194.726391]   Testing with caller provided syndrome...
[  196.852541]     FAIL: 113704 decoding failures!
[  196.853360]   Testing in-place interface...
[  198.509609]     FAIL: 44881 decoding failures!
[  198.510949] Testing beyond error correction capacity...
[  203.828405]     FAIL: 10647 silent failures!
[  203.829176] Testing (34,1)_512 code...
[  203.829842] Testing up to error correction capacity...
[  203.830697]   Testing correction buffer interface...
[  204.862634]     FAIL: 114591 decoding failures!
[  204.863944]   Testing with caller provided syndrome...
[  206.040509]     FAIL: 113788 decoding failures!
[  206.041369]   Testing in-place interface...
[  207.098438]     FAIL: 47327 decoding failures!
[  207.099755] Testing beyond error correction capacity...
[  208.318645]     FAIL: 684 silent failures!
[  208.319433] rslib: test failed
[  208.324689] gpio_winbond: chip ID at 2e is ffff
[  208.325557] gpio_winbond: not an our chip
[  208.326340] gpio_winbond: chip ID at 4e is ffff
[  208.327099] gpio_winbond: not an our chip
[  208.328489] IPMI message handler: version 39.2
[  208.329360] ipmi device interface
[  208.330680] ipmi_si: IPMI System Interface driver
[  208.331703] ipmi_si: Unable to find any System Interface(s)
[  208.332636] ipmi_ssif: IPMI SSIF Interface driver
[  208.333477] IPMI Watchdog: driver initialized
[  208.334537] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[  208.337452] ACPI: Power Button [PWRF]
[  208.426247] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[  208.451294] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[  208.483196] 00:06: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A
[  208.491826] Non-volatile memory driver v1.3
[  208.495147] random: get_random_u64 called from arch_rnd+0x24/0x50 with crng_init=0
[  208.497119] random: get_random_u64 called from load_elf_binary+0x4a8/0x12fe with crng_init=0
[  208.499651] random: get_random_u64 called from randomize_page+0x54/0x71 with crng_init=0
[  208.504395] ppdev: user-space parallel port driver
[  208.507078] dummy-irq: no IRQ given.  Use irq=N
[  208.518834] mtdoops: mtd device (mtddev=name/number) must be supplied
[  208.523807] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[  208.525032] e1000: Copyright (c) 1999-2006 Intel Corporation.
[  208.891017] PCI Interrupt Link [LNKC] enabled at IRQ 11
[  209.324220] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
[  209.326350] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
[  209.332846] panel: panel driver not yet registered
[  209.334739] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[  209.337684] serio: i8042 KBD port at 0x60,0x64 irq 1
[  209.338564] serio: i8042 AUX port at 0x60,0x64 irq 12


To reproduce:

        # build kernel
	cd linux
	cp config-5.2.0-rc6-00001-g4b4f3accd80304 .config
	make HOSTCC=gcc-4.9 CC=gcc-4.9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
	make HOSTCC=gcc-4.9 CC=gcc-4.9 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
	cd <mod-install-dir>
	find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email



Thanks,
lkp


View attachment "config-5.2.0-rc6-00001-g4b4f3accd80304" of type "text/plain" (108146 bytes)

View attachment "job-script" of type "text/plain" (4495 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (19484 bytes)

View attachment "locktorture" of type "text/plain" (1238 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ