[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aMoQ45j4KniTXv9H@xsang-OptiPlex-9020>
Date: Wed, 17 Sep 2025 09:37:39 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>, Crystal Wood <crwood@...hat.com>,
Wander Lairson Costa <wander@...hat.com>, Valentin Schneider
<vschneid@...hat.com>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: Re: [linus:master] [sched] 8671bad873:
INFO:task_blocked_for_more_than#seconds
hi, Luis,
On Tue, Sep 16, 2025 at 09:35:45PM -0300, Luis Claudio R. Goncalves wrote:
> On Mon, Sep 08, 2025 at 02:06:37PM -0300, Luis Claudio R. Goncalves wrote:
> > On Fri, Sep 05, 2025 at 10:49:35AM +0800, kernel test robot wrote:
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed "INFO:task_blocked_for_more_than#seconds" on:
> > >
> > > commit: 8671bad873ebeb082afcf7b4501395c374da6023 ("sched: Do not call __put_task_struct() on rt if pi_blocked_on is set")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > >
> > > [test failed on linus/master 08b06c30a44555a8b1d14950e4462a52bfa0758b]
> > > [test failed on linux-next/master 5d50cf9f7cf20a17ac469c20a2e07c29c1f6aab7]
> > >
> > > in testcase: rcutorture
> > > version:
> > > with following parameters:
> > >
> > > runtime: 300s
> > > test: cpuhotplug
> > > torture_type: tasks-rude
> >
> > I ran tests with the boxes I had at hand, x86_64 and arm64, and was unable
> > to reproduce the problem. Tomorrow, when I am back from a holiday,, I will
> > try to reproduce the problem with x86 (32 bits) VMs and x86 baremetal as it
> > seems to be the case on the report.
>
> I have been trying to reproduce the problem for a week now, on both
> baremetal and VMs, x86 32 bits, without success. I tried to limit the
> amount of CPUs and memory, to mimic as well as possible the test
> environment but that has not changed the test results at all.
>
> Are there any other pointers to reproduce this problem? Other than what can
> be extracted from the log excerpts available, I mean.
if you have any debug patch, we could test it in our env.
the original issue is really quite random...
>
> Best regards.
> Luis
>
> > In any case, I sent a follow-up patch that isolated those changes to
> > kernels with PREEMPT_RT enabled, as initially intended. That should solve
> > this case (if really caused by the commit in question). The patch I
> > mentioned is:
> >
> > [RESEND PATCH] sched: restore the behavior of put_task_struct() for non-rt
> > https://lore.kernel.org/all/aKxqGLNOp2sWJwnZ@uudg.org/
> >
> > Best regards,
> > Luis
> >
> > >
> > >
> > > config: i386-randconfig-017-20250830
> > > compiler: gcc-12
> > > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > >
> > > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > >
> > >
> > > the issue happens randomly and we observed various issues in tests by this
> > > commit, while parent keeps clean.
> > >
> > > =========================================================================================
> > > tbox_group/testcase/rootfs/kconfig/compiler/runtime/test/torture_type:
> > > vm-snb/rcutorture/debian-11.1-i386-20220923.cgz/i386-randconfig-017-20250830/gcc-12/300s/cpuhotplug/tasks-rude
> > >
> > > 7de9d4f946383f48 8671bad873ebeb082afcf7b4501
> > > ---------------- ---------------------------
> > > fail:runs %reproduction fail:runs
> > > | | |
> > > :200 16% 32:200 dmesg.BUG:kernel_hang_in_boot_stage
> > > :200 0% 1:200 dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
> > > :200 1% 2:200 dmesg.BUG:workqueue_lockup-pool
> > > :200 0% 1:200 dmesg.EIP:kthread_affine_preferred
> > > :200 0% 1:200 dmesg.EIP:lock_release
> > > :200 0% 1:200 dmesg.EIP:tick_clock_notify
> > > :200 12% 23:200 dmesg.INFO:task_blocked_for_more_than#seconds
> > > :200 12% 23:200 dmesg.Kernel_panic-not_syncing:hung_task:blocked_tasks
> > > :200 0% 1:200 dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks
> > > :200 0% 1:200 dmesg.WARNING:at_kernel/kthread.c:#kthread_affine_preferred
> > >
> > >
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > > | Closes: https://lore.kernel.org/oe-lkp/202509051010.e06823ab-lkp@intel.com
> > >
> > >
> > > [ 994.935251][ T26] INFO: task swapper/0:1 blocked for more than 491 seconds.
> > > [ 994.947414][ T26] Not tainted 6.16.0-rc6-00086-g8671bad873eb #1
> > > [ 994.951523][ T26] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [ 994.960576][ T26] task:swapper/0 state:D stack:5016 pid:1 tgid:1 ppid:0 task_flags:0x0140 flags:0x00004000
> > > [ 994.972581][ T26] Call Trace:
> > > [ 994.998429][ T26] __schedule (kernel/sched/core.c:5354 kernel/sched/core.c:6954)
> > > [ 995.035758][ T26] schedule (kernel/sched/core.c:7037 kernel/sched/core.c:7051)
> > > [ 995.044863][ T26] async_synchronize_cookie_domain (kernel/async.c:317 kernel/async.c:310)
> > > [ 995.050698][ T26] ? do_wait_intr (kernel/sched/wait.c:384)
> > > [ 995.059798][ T26] wait_for_initramfs (init/initramfs.c:778)
> > > [ 995.067798][ T26] populate_rootfs (init/initramfs.c:789)
> > > [ 995.070767][ T26] do_one_initcall (init/main.c:1274)
> > > [ 995.074441][ T26] ? initramfs_async_setup (init/initramfs.c:782)
> > > [ 995.098384][ T26] do_initcalls (init/main.c:1335 init/main.c:1352)
> > > [ 995.136744][ T26] kernel_init_freeable (init/main.c:1588)
> > > [ 995.136744][ T26] ? rest_init (init/main.c:1466)
> > > [ 995.158663][ T26] kernel_init (init/main.c:1476)
> > > [ 995.177750][ T26] ret_from_fork (arch/x86/kernel/process.c:154)
> > > [ 995.178146][ T26] ? rest_init (init/main.c:1466)
> > > [ 995.230129][ T26] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> > > [ 995.230129][ T26] entry_INT80_32 (arch/x86/entry/entry_32.S:945)
> > > [ 995.268743][ T26]
> > > [ 995.268743][ T26] Showing all locks held in the system:
> > > [ 995.336987][ T26] 1 lock held by khungtaskd/26:
> > > [ 995.446697][ T26] #0: 830cce10 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks (kernel/locking/lockdep.c:6768 (discriminator 1))
> > > [ 995.464546][ T26] 4 locks held by kworker/u4:2/29:
> > > [ 995.470758][ T26] 2 locks held by kworker/0:3/38:
> > > [ 995.498593][ T26]
> > > [ 995.599251][ T26] =============================================
> > > [ 995.599251][ T26]
> > > [ 995.729940][ T26] Kernel panic - not syncing: hung_task: blocked tasks
> > > [ 995.729940][ T26] CPU: 0 UID: 0 PID: 26 Comm: khungtaskd Not tainted 6.16.0-rc6-00086-g8671bad873eb #1 PREEMPT(full)
> > > [ 995.729940][ T26] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > > [ 995.729940][ T26] Call Trace:
> > > [ 995.729940][ T26] dump_stack_lvl (lib/dump_stack.c:124)
> > > [ 995.729940][ T26] dump_stack (lib/dump_stack.c:130)
> > > [ 995.729940][ T26] panic (kernel/panic.c:382)
> > > [ 995.729940][ T26] check_hung_uninterruptible_tasks (kernel/hung_task.c:311)
> > > [ 995.729940][ T26] watchdog (kernel/hung_task.c:470)
> > > [ 995.729940][ T26] kthread (kernel/kthread.c:464)
> > > [ 995.729940][ T26] ? check_hung_uninterruptible_tasks (kernel/hung_task.c:453)
> > > [ 995.729940][ T26] ? kthread_complete_and_exit (kernel/kthread.c:413)
> > > [ 995.729940][ T26] ret_from_fork (arch/x86/kernel/process.c:154)
> > > [ 995.729940][ T26] ? kthread_complete_and_exit (kernel/kthread.c:413)
> > > [ 995.729940][ T26] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> > > [ 995.729940][ T26] entry_INT80_32 (arch/x86/entry/entry_32.S:945)
> > >
> > >
> > >
> > > The kernel config and materials to reproduce are available at:
> > > https://download.01.org/0day-ci/archive/20250905/202509051010.e06823ab-lkp@intel.com
> > >
> > >
> > >
> > > --
> > > 0-DAY CI Kernel Test Service
> > > https://github.com/intel/lkp-tests/wiki
> > >
> > ---end quoted text---
> ---end quoted text---
>
Powered by blists - more mailing lists