Message-Id: <20150929163036.a1260608ab9b13273eebf9cb@linux-foundation.org>
Date: Tue, 29 Sep 2015 16:30:36 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Ulrich Obergfell <uobergfe@...hat.com>
Cc: linux-kernel@...r.kernel.org, dzickus@...hat.com,
atomlin@...hat.com
Subject: Re: [PATCH 0/5] improve handling of errors returned by kthread_park()
On Mon, 28 Sep 2015 22:44:07 +0200 Ulrich Obergfell <uobergfe@...hat.com> wrote:
> The original watchdog_park_threads() function that was introduced by
> commit 81a4beef91ba4a9e8ad6054ca9933dff7e25ff28 takes a very simple
> approach to handle errors returned by kthread_park(): It attempts to
> roll back all watchdog threads to the unparked state. However, this
> may be undesired behaviour from the perspective of the caller which
> may want to handle errors as appropriate in its specific context.
> Currently, there are two possible call chains:
>
> - watchdog suspend/resume interface
>
>     lockup_detector_suspend
>       watchdog_park_threads
>
> - write to parameters in /proc/sys/kernel
>
>     proc_watchdog_update
>       watchdog_enable_all_cpus
>         update_watchdog_all_cpus
>           watchdog_park_threads
>
> Instead of 'blindly' attempting to unpark the watchdog threads if a
> kthread_park() call fails, the new approach is to disable the lockup
> detectors in the above call chains. Failure becomes visible to the
> user as follows:
>
> - error messages from lockup_detector_suspend()
>   or watchdog_enable_all_cpus()
>
> - the state that can be read from /proc/sys/kernel/watchdog_enabled
>
> - the 'write' system call in the latter call chain returns an error
>
hm, you made me look at kthread parking. Why does it exist? What is a
"parked" thread anyway, and how does it differ from, say, a sleeping
one? The 2a1d446019f9a5983ec5a335b changelog is pretty useless and the
patch added no useful documentation, sigh.
Anyway... what inspired this patchset? Are you experiencing
kthread_park() failures in practice? If so, what is causing them? And
what is the user-visible effect of these failures? This is all pretty
important context for such a patchset.