linux-kernel - Re: [PATCH] hung_task: Skip hung task detection during core dump operations


Message-ID: <33f995c6-4db7-4e4c-ba12-eb5d05e8521c@linux.dev>
Date: Thu, 14 Aug 2025 11:12:52 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: "Nanji Parmar (he/him)" <nparmar@...estorage.com>
Cc: mhiramat@...nel.org, linux-kernel@...r.kernel.org,
 Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] hung_task: Skip hung task detection during core dump
 operations

Hi Nanji,

Thanks for your patch!

On 2025/8/14 06:01, Andrew Morton wrote:
> On Wed, 13 Aug 2025 11:30:36 -0700 "Nanji Parmar (he/him)" <nparmar@...estorage.com> wrote:
> 
>> Tasks involved in core dump operations can legitimately block for
>> extended periods, especially for large memory processes. The hung
>> task detector should skip tasks with PF_DUMPCORE (main dumping
>> thread) or PF_POSTCOREDUMP (other threads in the group) flags to
>> avoid false positive warnings.
>>
>> This prevents incorrect hung task reports during legitimate core
>> dump generation that can take xx minutes for large processes.
> 
> It isn't pleasing to be putting coredump special cases into the core of
> the hung-task detector.  Perhaps the hung task detector should get an

Yeah, adding a special case for coredumps is not a good design ;)

> equivalent to touch_softlockup_watchdog().  I'm surprised it doesn't
> already have such a thing.  Maybe it does and I've forgotten where it is.
> 
> Please provide a full description of the problem, mainly the relevant
> dmesg output.  Please always provide this full description when
> addressing kernel issues, thanks.

Interestingly, I wasn't able to reproduce the hung task warning on my
machine with an SSD, even when generating a 100 GiB coredump. The process
switches between R and D states so fast that it never hits the timeout,
even with hung_task_timeout_secs set as low as 5s ;)

So it seems this isn't a general problem for all coredumps. It looks like
it only happens on systems with slow I/O, which can cause a process to
stay in D state for a long time.
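
For anyone who wants to check this on their own hardware, here is a rough
userspace sketch (not part of the patch, and not how the kernel detector
works internally). It polls /proc/<pid>/stat and reports the longest
stretch of consecutive samples in which the task was seen in D state. The
helper names (parse_state, longest_run, watch) and the 0.1s polling
interval are my own assumptions for illustration.

```python
import time


def parse_state(stat_line: str) -> str:
    """Return the state letter (R, S, D, ...) from a /proc/<pid>/stat line.

    The comm field is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' instead of on whitespace.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def longest_run(samples, state="D"):
    """Longest time span (seconds) covered by consecutive samples in `state`.

    `samples` is an iterable of (timestamp, state) pairs in time order.
    """
    longest = 0.0
    start = None
    for ts, st in samples:
        if st == state:
            if start is None:
                start = ts
            longest = max(longest, ts - start)
        else:
            start = None
    return longest


def watch(pid, interval=0.1, duration=10.0):
    """Sample the state of `pid` for `duration` seconds (hypothetical helper)."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            with open(f"/proc/{pid}/stat") as f:
                samples.append((time.monotonic(), parse_state(f.read())))
        except (FileNotFoundError, ProcessLookupError):
            break  # the task exited while we were watching
        time.sleep(interval)
    return longest_run(samples)
```

Calling watch(pid) on the dumping task while the coredump is being written
gives a rough upper bound to compare against hung_task_timeout_secs. Since
this only samples, it can miss brief switches back to R between polls, so
it may overestimate how long the task was really blocked.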

Anyway, any task *actually* blocked on I/O for that long should be flagged;
that is the hung task detector's job, IMHO.

Thanks,
Lance