Message-ID: <CAJnrk1Yq5qQmCKw8rFYC=7mgBMf1+6P8c6HYKiunA88ZXNwNgg@mail.gmail.com>
Date: Thu, 8 Jan 2026 14:57:54 -0800
From: Joanne Koong <joannelkoong@...il.com>
To: Zhang Tianci <zhangtianci.1997@...edance.com>
Cc: miklos@...redi.hu, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, xieyongji@...edance.com,
zhujia.zj@...edance.com, Jiachen Zhang <zhangjiachen.jaycee@...edance.com>
Subject: Re: [External] Re: [PATCH] fuse: add hang check in request_wait_answer()
On Wed, Jan 7, 2026 at 6:25 PM Zhang Tianci
<zhangtianci.1997@...edance.com> wrote:
>
> Hi Joanne,
>
> > I think if the fusedaemon is in a process exit state (by "process exit
> > state", I think you're talking about the state where
> > fuse_session_exit() has been called but the daemon is stuck/hanging
> > before actual process exit?), this can still be detected in libfuse.
> > For example one idea could be libfuse spinning up a watchdog monitor
> > thread that has logic checking if the session's mt_exited has been set
> > with no progress on /sys/fs/fuse/.../waiting requests being fulfilled.
>
> The process exit state I referred to is a more severe scenario:
> the FUSEDaemon may be killed abruptly due to bugs or OOM.
> In such an unexpected exit, no userspace threads can run normally.
> However, some threads may remain stuck in the kernel and fail to exit properly.
Hmm, doesn't the CONFIG_DETECT_HUNG_TASK config already detect this?
The summary of it [1] says:
"Say Y here to enable the kernel to detect "hung tasks",
which are bugs that cause the task to be stuck in
uninterruptible "D" state indefinitely.
When a hung task is detected, the kernel will print the
current stack trace (which you should report), but the
task will stay in uninterruptible state. If lockdep is
enabled then all held locks will also be reported. This
feature has negligible overhead."
[1] https://www.kernelconfig.io/config_detect_hung_task
Another idea might be a system script that runs after the daemon
process exits and checks whether any of its threads are still
lingering in D state.
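A rough sketch of what such a check could look like, assuming Linux
procfs and that the script is handed the daemon's PID (with the thread
group's /proc/<pid>/task entries still present because some threads have
not exited). The function names and the proc_root parameter are
hypothetical, not part of any existing tool:

```python
import os


def parse_state(stat_line):
    """Return the state letter from a /proc/<pid>/task/<tid>/stat line.

    The comm field is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' before reading the state.
    """
    rest = stat_line.rsplit(")", 1)[1]
    return rest.split()[0]


def d_state_threads(pid, proc_root="/proc"):
    """Return sorted (tid, comm) pairs for threads of pid in D state."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    try:
        tids = os.listdir(task_dir)
    except FileNotFoundError:
        return []  # every thread is gone; nothing is lingering

    stuck = []
    for tid in tids:
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                line = f.read()
        except (FileNotFoundError, ProcessLookupError):
            continue  # thread exited while we were scanning
        if parse_state(line) == "D":
            comm = line[line.index("(") + 1:line.rindex(")")]
            stuck.append((int(tid), comm))
    return sorted(stuck)
```

Something like d_state_threads(daemon_pid) could then be polled a few
times after the daemon is expected to be gone. A single D-state sample
is not proof of a hang, so a real script would probably only report
threads that stay in D state across several samples.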
Thanks,
Joanne
>
> We have encountered at least two such cases:
>
> 1. The mount system call of the FUSEDaemon is blocked by other threads
> and cannot acquire the super_block_list lock. (Our FUSEDaemon supports
> multiple mount points, so this mount operation will affect the other
> mount points within the FUSEDaemon process.)
> 2. The jbd2 subsystem of the ext4 filesystem, which the FUSEDaemon
> logging system depends on, triggers a logical deadlock caused by
> priority inversion.
>
> In these instances, a userspace watchdog would be ineffective.
>
> Thanks,
> Tianci