Message-ID: <YUqwkXkIu9Wx+Btg@piliu.users.ipa.redhat.com>
Date: Wed, 22 Sep 2021 12:26:57 +0800
From: Pingfan Liu <piliu@...hat.com>
To: Petr Mladek <pmladek@...e.com>
Cc: Pingfan Liu <kernelfans@...il.com>, linux-kernel@...r.kernel.org,
Sumit Garg <sumit.garg@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Marc Zyngier <maz@...nel.org>,
Julien Thierry <jthierry@...hat.com>,
Kees Cook <keescook@...omium.org>,
Masahiro Yamada <masahiroy@...nel.org>,
Sami Tolvanen <samitolvanen@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Wang Qing <wangqing@...o.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Santosh Sivaraj <santosh@...six.org>
Subject: Re: [PATCH 3/5] kernel/watchdog: adapt the watchdog_hld interface
for async model
On Mon, Sep 20, 2021 at 10:20:46AM +0200, Petr Mladek wrote:
> On Fri 2021-09-17 23:41:31, Pingfan Liu wrote:
[...]
> >
> > I had thought about queue_work_on() in watchdog_nmi_enable(). But since
> > this work would block the worker kthread for this cpu, another worker
> > kthread would have to be created to handle other work.
>
> This is not a problem. workqueues use a pool of workers that are
> already created and can be used when one worker gets blocked.
>
Yes, you are right. Worker creation is dynamic, so the pool is immune to
a blocked worker.
> > But now, I think queue_work_on() may be more neat.
> >
> > > must wait in a loop until someone else stop it and read
> > > the exit code.
> > >
> > Is this behavior mandatory? This kthread can decide its exit
> > condition by itself.
>
> I am pretty sure. Unfortunately, I can't find it in the documentation.
>
> My view is the following. Each process has a task_struct. The
> scheduler needs task_struct so that it can switch processes.
> The task_struct must still exist when the process exits.
> The scheduler puts the task into TASK_DEAD state.
> Another process has to read the exit code and destroy the
> task struct.
>
Thanks for bringing this up; it gave me the opportunity to think it through.
The core of the question is put_task_struct(): who drops the last
reference. For a kthread it should be
finish_task_switch()->put_task_struct_rcu_user()->delayed_put_task_struct()->put_task_struct()
when (unlikely(prev_state == TASK_DEAD)), so releasing the task_struct
does not depend on another task.
> See do_exit() in kernel/exit.c. It ends with do_task_dead().
> That is the point when the process goes into TASK_DEAD state.
>
> For a good example, see lib/test_vmalloc.c. The kthread waits
> until anyone want him to stop:
>
> static int test_func(void *private)
> {
> [...]
>
> /*
> * Wait for the kthread_stop() call.
> */
> while (!kthread_should_stop())
> msleep(10);
>
> return 0;
> }
>
> The kthreads are started and stopped in:
>
> static void do_concurrent_test(void)
> {
> [...]
> for (i = 0; i < nr_threads; i++) {
> [...]
> t->task = kthread_run(test_func, t, "vmalloc_test/%d", i);
> [...]
> /*
> * Sleep quiet until all workers are done with 1 second
> * interval. Since the test can take a lot of time we
> * can run into a stack trace of the hung task. That is
> * why we go with completion_timeout and HZ value.
> */
> do {
> ret = wait_for_completion_timeout(&test_all_done_comp, HZ);
> } while (!ret);
> [...]
> for (i = 0; i < nr_threads; i++) {
> [...]
> if (!IS_ERR(t->task))
> kthread_stop(t->task);
> [...]
> }
They are good and elegant examples.
>
>
> You do not have to solve this if you use the system workqueue
> (system_wq).
>
Yes, workqueue is a better choice.
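As a rough sketch of that direction (the function and work-item names here
are illustrative only, not taken from the actual patch), queueing the enable
step on the system workqueue could look like:

```c
/* Sketch only: watchdog_hld_work and watchdog_hld_enable_fn are
 * hypothetical names, not from the real series. */
static void watchdog_hld_enable_fn(struct work_struct *work)
{
	/* Probe/enable the hard lockup detector on this CPU. */
}

static DEFINE_PER_CPU(struct work_struct, watchdog_hld_work);

void watchdog_nmi_enable(unsigned int cpu)
{
	struct work_struct *w = per_cpu_ptr(&watchdog_hld_work, cpu);

	INIT_WORK(w, watchdog_hld_enable_fn);
	/* Runs from the shared worker pool bound to @cpu: no dedicated
	 * kthread to create, and no exit code to reap afterwards. */
	queue_work_on(cpu, system_wq, w);
}
```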
Thanks for your great patience.
Regards,
Pingfan