Message-ID: <aMoCYelz87V8bSzA@uudg.org>
Date: Tue, 16 Sep 2025 21:35:45 -0300
From: "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
	Peter Zijlstra <peterz@...radead.org>,
	Crystal Wood <crwood@...hat.com>,
	Wander Lairson Costa <wander@...hat.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	aubrey.li@...ux.intel.com, yu.c.chen@...el.com
Subject: Re: [linus:master] [sched]  8671bad873:
 INFO:task_blocked_for_more_than#seconds

On Mon, Sep 08, 2025 at 02:06:37PM -0300, Luis Claudio R. Goncalves wrote:
> On Fri, Sep 05, 2025 at 10:49:35AM +0800, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed "INFO:task_blocked_for_more_than#seconds" on:
> > 
> > commit: 8671bad873ebeb082afcf7b4501395c374da6023 ("sched: Do not call __put_task_struct() on rt if pi_blocked_on is set")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > [test failed on      linus/master 08b06c30a44555a8b1d14950e4462a52bfa0758b]
> > [test failed on linux-next/master 5d50cf9f7cf20a17ac469c20a2e07c29c1f6aab7]
> > 
> > in testcase: rcutorture
> > version: 
> > with following parameters:
> > 
> > 	runtime: 300s
> > 	test: cpuhotplug
> > 	torture_type: tasks-rude
> 
> I ran tests with the boxes I had at hand, x86_64 and arm64, and was unable
> to reproduce the problem. Tomorrow, when I am back from a holiday, I will
> try to reproduce the problem with x86 (32-bit) VMs and x86 baremetal, as
> that seems to be the setup in the report.

I have been trying to reproduce the problem for a week now, on both
baremetal and VMs, 32-bit x86, without success. I tried limiting the
number of CPUs and the amount of memory to mimic the test environment as
closely as possible, but that has not changed the test results at all.
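
For reference, here is a minimal sketch of how a VM modelled on the reported
test machine could be launched with a reduced CPU count or memory size. Only
the qemu-system-x86_64 options -enable-kvm, -cpu SandyBridge, -smp 2 and
-m 16G come from the report. The kernel and initrd paths, the console
setting and the helper name are hypothetical, and passing
rcutorture.torture_type on the kernel command line assumes rcutorture is
built into the kernel. It does not try to reproduce how lkp drives the
cpuhotplug test.

```python
import shlex


def build_qemu_argv(kernel, initrd, smp=2, mem="16G", extra_append=()):
    """Assemble a QEMU command line modelled on the reported test machine.

    Defaults for smp and mem match the report; kernel, initrd and the
    console setting are illustrative placeholders, not taken from the report.
    """
    append = [
        "console=ttyS0",                        # assumption: serial console
        "rcutorture.torture_type=tasks-rude",   # torture_type from the report
        *extra_append,
    ]
    return [
        "qemu-system-x86_64", "-enable-kvm",
        "-cpu", "SandyBridge",
        "-smp", str(smp),
        "-m", mem,
        "-kernel", kernel,
        "-initrd", initrd,
        "-append", " ".join(append),
        "-nographic",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command for a smaller variant of the test machine.
    print(shlex.join(build_qemu_argv("bzImage", "initrd.cgz", smp=1, mem="2G")))
```

Varying smp and mem this way keeps the rest of the command line identical
between runs, so any change in behaviour can be attributed to those two
settings.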

Are there any other pointers for reproducing this problem, beyond what can
be extracted from the available log excerpts?

Best regards.
Luis

> In any case, I sent a follow-up patch that isolated those changes to
> kernels with PREEMPT_RT enabled, as initially intended. That should solve
> this case (if really caused by the commit in question). The patch I
> mentioned is:
> 
>     [RESEND PATCH] sched: restore the behavior of put_task_struct() for non-rt
>     https://lore.kernel.org/all/aKxqGLNOp2sWJwnZ@uudg.org/
> 
> Best regards,
> Luis
>  
> > 
> > 
> > config: i386-randconfig-017-20250830
> > compiler: gcc-12
> > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > 
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> > 
> > the issue happens randomly, and we observed various issues in tests with
> > this commit, while the parent stays clean.
> > 
> > =========================================================================================
> > tbox_group/testcase/rootfs/kconfig/compiler/runtime/test/torture_type:
> >   vm-snb/rcutorture/debian-11.1-i386-20220923.cgz/i386-randconfig-017-20250830/gcc-12/300s/cpuhotplug/tasks-rude
> > 
> > 7de9d4f946383f48 8671bad873ebeb082afcf7b4501
> > ---------------- ---------------------------
> >        fail:runs  %reproduction    fail:runs
> >            |             |             |
> >            :200         16%          32:200   dmesg.BUG:kernel_hang_in_boot_stage
> >            :200          0%           1:200   dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
> >            :200          1%           2:200   dmesg.BUG:workqueue_lockup-pool
> >            :200          0%           1:200   dmesg.EIP:kthread_affine_preferred
> >            :200          0%           1:200   dmesg.EIP:lock_release
> >            :200          0%           1:200   dmesg.EIP:tick_clock_notify
> >            :200         12%          23:200   dmesg.INFO:task_blocked_for_more_than#seconds
> >            :200         12%          23:200   dmesg.Kernel_panic-not_syncing:hung_task:blocked_tasks
> >            :200          0%           1:200   dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks
> >            :200          0%           1:200   dmesg.WARNING:at_kernel/kthread.c:#kthread_affine_preferred
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202509051010.e06823ab-lkp@intel.com
> > 
> > 
> > [  994.935251][   T26] INFO: task swapper/0:1 blocked for more than 491 seconds.
> > [  994.947414][   T26]       Not tainted 6.16.0-rc6-00086-g8671bad873eb #1
> > [  994.951523][   T26] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  994.960576][   T26] task:swapper/0       state:D stack:5016  pid:1     tgid:1     ppid:0      task_flags:0x0140 flags:0x00004000
> > [  994.972581][   T26] Call Trace:
> > [ 994.998429][ T26] __schedule (kernel/sched/core.c:5354 kernel/sched/core.c:6954) 
> > [ 995.035758][ T26] schedule (kernel/sched/core.c:7037 kernel/sched/core.c:7051) 
> > [ 995.044863][ T26] async_synchronize_cookie_domain (kernel/async.c:317 kernel/async.c:310) 
> > [ 995.050698][ T26] ? do_wait_intr (kernel/sched/wait.c:384) 
> > [ 995.059798][ T26] wait_for_initramfs (init/initramfs.c:778) 
> > [ 995.067798][ T26] populate_rootfs (init/initramfs.c:789) 
> > [ 995.070767][ T26] do_one_initcall (init/main.c:1274) 
> > [ 995.074441][ T26] ? initramfs_async_setup (init/initramfs.c:782) 
> > [ 995.098384][ T26] do_initcalls (init/main.c:1335 init/main.c:1352) 
> > [ 995.136744][ T26] kernel_init_freeable (init/main.c:1588) 
> > [ 995.136744][ T26] ? rest_init (init/main.c:1466) 
> > [ 995.158663][ T26] kernel_init (init/main.c:1476) 
> > [ 995.177750][ T26] ret_from_fork (arch/x86/kernel/process.c:154) 
> > [ 995.178146][ T26] ? rest_init (init/main.c:1466) 
> > [ 995.230129][ T26] ret_from_fork_asm (arch/x86/entry/entry_32.S:737) 
> > [ 995.230129][ T26] entry_INT80_32 (arch/x86/entry/entry_32.S:945) 
> > [  995.268743][   T26]
> > [  995.268743][   T26] Showing all locks held in the system:
> > [  995.336987][   T26] 1 lock held by khungtaskd/26:
> > [ 995.446697][ T26] #0: 830cce10 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks (kernel/locking/lockdep.c:6768 (discriminator 1)) 
> > [  995.464546][   T26] 4 locks held by kworker/u4:2/29:
> > [  995.470758][   T26] 2 locks held by kworker/0:3/38:
> > [  995.498593][   T26]
> > [  995.599251][   T26] =============================================
> > [  995.599251][   T26]
> > [  995.729940][   T26] Kernel panic - not syncing: hung_task: blocked tasks
> > [  995.729940][   T26] CPU: 0 UID: 0 PID: 26 Comm: khungtaskd Not tainted 6.16.0-rc6-00086-g8671bad873eb #1 PREEMPT(full)
> > [  995.729940][   T26] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > [  995.729940][   T26] Call Trace:
> > [ 995.729940][ T26] dump_stack_lvl (lib/dump_stack.c:124) 
> > [ 995.729940][ T26] dump_stack (lib/dump_stack.c:130) 
> > [ 995.729940][ T26] panic (kernel/panic.c:382) 
> > [ 995.729940][ T26] check_hung_uninterruptible_tasks (kernel/hung_task.c:311) 
> > [ 995.729940][ T26] watchdog (kernel/hung_task.c:470) 
> > [ 995.729940][ T26] kthread (kernel/kthread.c:464) 
> > [ 995.729940][ T26] ? check_hung_uninterruptible_tasks (kernel/hung_task.c:453) 
> > [ 995.729940][ T26] ? kthread_complete_and_exit (kernel/kthread.c:413) 
> > [ 995.729940][ T26] ret_from_fork (arch/x86/kernel/process.c:154) 
> > [ 995.729940][ T26] ? kthread_complete_and_exit (kernel/kthread.c:413) 
> > [ 995.729940][ T26] ret_from_fork_asm (arch/x86/entry/entry_32.S:737) 
> > [ 995.729940][ T26] entry_INT80_32 (arch/x86/entry/entry_32.S:945) 
> > 
> > 
> > 
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20250905/202509051010.e06823ab-lkp@intel.com
> > 
> > 
> > 
> > -- 
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> > 
> ---end quoted text---
---end quoted text---

