[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CA+G9fYtOCNQooW75is8yYBiJkGvNu52b1XoYP+99XwfvHPoNrA@mail.gmail.com>
Date: Wed, 4 Sep 2024 18:47:48 +0530
From: Naresh Kamboju <naresh.kamboju@...aro.org>
To: open list <linux-kernel@...r.kernel.org>, LTP List <ltp@...ts.linux.it>,
lkft-triage@...ts.linaro.org
Cc: Peter Zijlstra <peterz@...radead.org>, Valentin Schneider <vschneid@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dan Carpenter <dan.carpenter@...aro.org>,
Arnd Bergmann <arnd@...db.de>, Anders Roxell <anders.roxell@...aro.org>,
Richard Palethorpe <rpalethorpe@...e.com>, chrubis <chrubis@...e.cz>
Subject: next: WARNING: at kernel/sched/fair.c:6058 unthrottle_cfs_rq
(kernel/sched/fair.c:6058 (discriminator 1))
The following kernel warning is noticed on DUT arm64 Juno-r2 and x86 devices
and Virtual environment qemu-arm64 and qemu-x86_64 running Linux next.
This is not a new regression and we have been noticing this warning on
the Linux next kernel as per the available data.
List of devices encountering this kernel warning.
- dragonboard-410c
- juno-r2
- qemu-arm64
- qemu-riscv64
- qemu-x86_64
- x86_64
Anders bisected this down to,
# first bad commit:
[2e0199df252a536a03f4cb0810324dff523d1e79]
sched/fair: Prepare exit/cleanup paths for delayed_dequeue
Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
Test log qemu-arm64 :
----------
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
[ 0.000000] Linux version 6.11.0-rc6-next-20240902
(tuxmake@...make) (aarch64-linux-gnu-gcc (Debian 13.3.0-5) 13.3.0, GNU
ld (GNU Binutils for Debian) 2.43) #1 SMP PREEMPT @1725295942
[ 0.000000] KASLR enabled
[ 0.000000] random: crng init done
[ 0.000000] Machine model: linux,dummy-virt
....
cfs_bandwidth01.c:58: TINFO: Set 'worker1/cpu.max' = '3000 10000'
cfs_bandwidth01.c:58: TINFO: Set 'worker2/cpu.max' = '2000 10000'
cfs_bandwidth01.c:58: TINFO: Set 'worker3/cpu.max' = '3000 10000'
cfs_bandwidth01.c:121: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:58: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:133: TPASS: Workers exited
cfs_bandwidth01.c:121: TPASS: Scheduled bandwidth constrained workers
<4>[ 76.364066] ------------[ cut here ]------------
<4>[ 76.364786] se->sched_delayed
<4>[ 76.365535] WARNING: CPU: 0 PID: 0 at kernel/sched/fair.c:6058
unthrottle_cfs_rq (kernel/sched/fair.c:6058 (discriminator 1))
<4>[ 76.366982] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce
sha512_ce sha512_arm64 fuse drm backlight dm_mod ip_tables x_tables
<4>[ 76.369703] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted
6.11.0-rc6-next-20240902 #1
<4>[ 76.370575] Hardware name: linux,dummy-virt (DT)
<4>[ 76.371264] pstate: 624000c9 (nZCv daIF +PAN -UAO +TCO -DIT
-SSBS BTYPE=--)
<4>[ 76.371928] pc : unthrottle_cfs_rq (kernel/sched/fair.c:6058
(discriminator 1))
<4>[ 76.372353] lr : unthrottle_cfs_rq (kernel/sched/fair.c:6058
(discriminator 1))
<4>[ 76.372811] sp : ffff800080003d40
<4>[ 76.373158] x29: ffff800080003d40 x28: fff00000ff4c8fc0 x27:
ffff942fdc51e4ae
<4>[ 76.373978] x26: 0000000000000080 x25: 0000000000000000 x24:
0000000000000001
<4>[ 76.374670] x23: fff00000ff4c8fc0 x22: 0000000000000000 x21:
fff0000005eec400
<4>[ 76.375312] x20: fff0000005f29800 x19: fff00000044d4c00 x18:
0000000000000006
<4>[ 76.375955] x17: fff06bd123665000 x16: ffff800080000000 x15:
ffff8000800036d0
<4>[ 76.376600] x14: 0000000000000000 x13: 646579616c65645f x12:
64656863733e2d65
<4>[ 76.377284] x11: fffffffffffe0000 x10: ffff942fdbfe0238 x9 :
ffff942fd9944c54
<4>[ 76.378034] x8 : 00000000ffffefff x7 : ffff942fdbfde390 x6 :
0000000000000147
<4>[ 76.378675] x5 : 0000000000000148 x4 : 40000000fffff147 x3 :
0000000000000000
<4>[ 76.379341] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
ffff942fdbf6ab00
<4>[ 76.380133] Call trace:
<4>[ 76.380430] unthrottle_cfs_rq (kernel/sched/fair.c:6058 (discriminator 1))
<4>[ 76.380910] distribute_cfs_runtime (kernel/sched/fair.c:6254)
<4>[ 76.381345] sched_cfs_period_timer (kernel/sched/fair.c:6307
kernel/sched/fair.c:6525)
<4>[ 76.381765] __hrtimer_run_queues (kernel/time/hrtimer.c:1691
kernel/time/hrtimer.c:1755)
<4>[ 76.382253] hrtimer_interrupt (kernel/time/hrtimer.c:1820)
<4>[ 76.382629] arch_timer_handler_phys
(drivers/clocksource/arm_arch_timer.c:675
drivers/clocksource/arm_arch_timer.c:692)
<4>[ 76.383040] handle_percpu_devid_irq (kernel/irq/chip.c:942
(discriminator 2))
<4>[ 76.383442] generic_handle_domain_irq (kernel/irq/irqdesc.c:693
kernel/irq/irqdesc.c:748)
<4>[ 76.383858] gic_handle_irq (drivers/irqchip/irq-gic.c:344 (discriminator 1))
<4>[ 76.384219] call_on_irq_stack (arch/arm64/kernel/entry.S:895)
<4>[ 76.384591] do_interrupt_handler (arch/arm64/kernel/entry-common.c:310)
<4>[ 76.385011] el1_interrupt (arch/arm64/kernel/entry-common.c:537
arch/arm64/kernel/entry-common.c:551)
<4>[ 76.385377] el1h_64_irq_handler (arch/arm64/kernel/entry-common.c:557)
<4>[ 76.385779] el1h_64_irq (arch/arm64/kernel/entry.S:594)
<4>[ 76.386138] __schedule (kernel/sched/sched.h:1501 kernel/sched/core.c:6681)
<4>[ 76.386481] schedule_idle
(include/asm-generic/bitops/generic-non-atomic.h:128
include/linux/thread_info.h:192 include/linux/sched.h:2109
kernel/sched/core.c:6796)
<4>[ 76.386822] do_idle (kernel/sched/idle.c:358)
<4>[ 76.387149] cpu_startup_entry (kernel/sched/idle.c:423)
<4>[ 76.387510] rest_init (main.c:?)
<4>[ 76.387837] start_kernel (init/main.c:915 (discriminator 1))
<4>[ 76.388183] __primary_switched (arch/arm64/kernel/head.S:244)
<4>[ 76.388714] ---[ end trace 0000000000000000 ]---
<4>[ 76.391991] ------------[ cut here ]------------
<4>[ 76.392496] delay && se->sched_delayed
<4>[ 76.392585] WARNING: CPU: 1 PID: 0 at kernel/sched/fair.c:5486
dequeue_entity (kernel/sched/fair.c:5486 (discriminator 1))
<4>[ 76.393578] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce
sha512_ce sha512_arm64 fuse drm backlight dm_mod ip_tables x_tables
<4>[ 76.400266] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G
W 6.11.0-rc6-next-20240902 #1
<4>[ 76.401509] Tainted: [W]=WARN
<4>[ 76.402412] Hardware name: linux,dummy-virt (DT)
<4>[ 76.403417] pstate: 624000c9 (nZCv daIF +PAN -UAO +TCO -DIT
-SSBS BTYPE=--)
<4>[ 76.404614] pc : dequeue_entity (kernel/sched/fair.c:5486 (discriminator 1))
<4>[ 76.405630] lr : dequeue_entity (kernel/sched/fair.c:5486 (discriminator 1))
<4>[ 76.406618] sp : ffff80008000bc00
<4>[ 76.407479] x29: ffff80008000bc00 x28: 0000000000000000 x27:
0000000000000009
<4>[ 76.408828] x26: fff0000005f29800 x25: fff0000004601300 x24:
0000000000000008
<4>[ 76.410693] x23: 0000000000000001 x22: 0000000000000000 x21:
0000000000000009
<4>[ 76.411616] x20: fff0000005f29800 x19: fff00000044d4c00 x18:
0000000000000006
<4>[ 76.413420] x17: fff06bd123687000 x16: ffff800080008000 x15:
ffff80008000b590
<4>[ 76.414827] x14: 0000000000000000 x13: 646579616c65645f x12:
64656863733e2d65
<4>[ 76.415752] x11: fffffffffffe0000 x10: ffff942fdbfe0640 x9 :
ffff942fd9944c54
<4>[ 76.417555] x8 : 00000000ffffefff x7 : ffff942fdbfde390 x6 :
0000000000000172
<4>[ 76.418529] x5 : 0000000000000173 x4 : 40000000fffff172 x3 :
0000000000000000
<4>[ 76.419887] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
fff0000003c83900
<4>[ 76.421775] Call trace:
<4>[ 76.422157] dequeue_entity (kernel/sched/fair.c:5486 (discriminator 1))
<4>[ 76.423143] dequeue_entities (kernel/sched/fair.c:7099 (discriminator 1))
<4>[ 76.424546] dequeue_task_fair (kernel/sched/fair.c:7187 (discriminator 1))
<4>[ 76.425129] deactivate_task (kernel/sched/core.c:2075)
<4>[ 76.426155] sched_balance_rq (kernel/sched/fair.c:9378
kernel/sched/fair.c:9513 kernel/sched/fair.c:11657)
<4>[ 76.427142] sched_balance_domains (kernel/sched/fair.c:12078
(discriminator 1))
<4>[ 76.428151] sched_balance_softirq (kernel/sched/fair.c:12791)
<4>[ 76.429218] handle_softirqs
(arch/arm64/include/asm/jump_label.h:32 include/linux/jump_label.h:207
include/trace/events/irq.h:142 kernel/softirq.c:555)
<4>[ 76.430268] __do_softirq (kernel/softirq.c:589)
<4>[ 76.431615] ____do_softirq (arch/arm64/kernel/irq.c:82)
<4>[ 76.432535] call_on_irq_stack (arch/arm64/kernel/entry.S:895)
<4>[ 76.433171] do_softirq_own_stack (arch/arm64/kernel/irq.c:87)
<4>[ 76.434232] irq_exit_rcu (kernel/softirq.c:435
kernel/softirq.c:637 kernel/softirq.c:649)
<4>[ 76.435577] el1_interrupt (arch/arm64/include/asm/current.h:19
arch/arm64/kernel/entry-common.c:280
arch/arm64/kernel/entry-common.c:539
arch/arm64/kernel/entry-common.c:551)
<4>[ 76.436496] el1h_64_irq_handler (arch/arm64/kernel/entry-common.c:557)
<4>[ 76.437081] el1h_64_irq (arch/arm64/kernel/entry.S:594)
<4>[ 76.438464] default_idle_call (kernel/sched/idle.c:126)
<4>[ 76.439004] do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:326)
<4>[ 76.439910] cpu_startup_entry (kernel/sched/idle.c:423)
<4>[ 76.440930] secondary_start_kernel
(arch/arm64/include/asm/atomic_ll_sc.h:95 (discriminator 2)
arch/arm64/include/asm/atomic.h:28 (discriminator 2)
include/linux/atomic/atomic-arch-fallback.h:546 (discriminator 2)
include/linux/atomic/atomic-arch-fallback.h:994 (discriminator 2)
include/linux/atomic/atomic-instrumented.h:436 (discriminator 2)
include/linux/sched/mm.h:37 (discriminator 2)
arch/arm64/kernel/smp.c:214 (discriminator 2))
<4>[ 76.442456] __secondary_switched (arch/arm64/kernel/head.S:418)
<4>[ 76.443021] ---[ end trace 0000000000000000 ]---
cfs_bandwidth01.c:58: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:133: TPASS: Workers exited
cfs_bandwidth01.c:121: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:58: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:133: TPASS: Workers exited
cfs_bandwidth01.c:121: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:58: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:133: TPASS: Workers exited
cfs_bandwidth01.c:121: TPASS: Scheduled bandwidth constrained workers
cfs_bandwidth01.c:58: TINFO: Set 'level2/cpu.max' = '5000 10000'
cfs_bandwidth01.c:133: TPASS: Workers exited
tst_test.c:1660: TFAIL: Kernel is now tainted.
HINT: You _MAY_ be missing kernel fixes:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39f23ce07b93
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b34cb07dde7c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cbc
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ab297bab984
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d4d22468dae
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fdaba61ef8a2
Summary:
passed 10
failed 1
Links:
------
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240902/testrun/25012571/suite/log-parser-test/tests/
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240902/testrun/25012571/suite/log-parser-test/test/check-kernel-exception-delay-se-sched_delayed-7f35581d9865db33d9f09972c01ae7b6cb8a509142c92c54add4efc4117697cf/log
Test history compare log:
------------
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240902/testrun/25012571/suite/log-parser-test/test/check-kernel-exception-se-sched_delayed/history/
metadata:
----
git describe: next-20240902
git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
git sha: ecc768a84f0b8e631986f9ade3118fa37852fef0
kernel config:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2lWV3UVei3To0rSt5txVKQouoWS/config
kernel version: 6.11.0-rc6
build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2lWV3UVei3To0rSt5txVKQouoWS/
toolchain: clang-18 and gcc-13
arch: arm64 and x86_64
Steps to reproduce:
---------
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2lWV3UVei3To0rSt5txVKQouoWS/tuxmake_reproducer.sh
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2lWV3UVei3To0rSt5txVKQouoWS/tux_plan.yaml
--
Linaro LKFT
https://lkft.linaro.org
Powered by blists - more mailing lists