[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240710213418.GH9228@redhat.com>
Date: Wed, 10 Jul 2024 23:34:19 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Tejun Heo <tj@...nel.org>
Cc: Pavel Begunkov <asml.silence@...il.com>, io-uring@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>,
Andrew Morton <akpm@...ux-foundation.org>,
Christian Brauner <brauner@...nel.org>,
Tycho Andersen <tandersen@...flix.com>,
Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
Julian Orth <ju.orth@...il.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 2/2] kernel: rerun task_work while freezing in
get_signal()
On 07/10, Tejun Heo wrote:
>
> Hello,
>
> On Wed, Jul 10, 2024 at 09:10:16PM +0200, Oleg Nesterov wrote:
> ...
> > If nothing else. CRIU needs to attach and make this task TASK_TRACED, right?
>
> Yeah, AFAIK, that's the only way to implement check-pointing for now.
OK,
> > And once the target task is traced, it won't react to task_work_add(TWA_SIGNAL).
>
> I don't know how task_work is being used but the requirement would be that
> if a cgroup is frozen, task_works shouldn't be making state changes which
> can't safely be replayed (e.g. by restarting the frozen syscalls).
Well, in theory task_work can do "anything".
Of course, it can't, say, restart a frozen syscall, task_work_run() just
executes the callbacks in kernel mode and returns.
> it'd be better to freeze them together.
And I tend to agree. simply beacase do_freezer_trap() (and more users of
clear_thread_flag(TIF_SIGPENDING) + schedule(TASK_INTERRUPTIBLE) pattern)
do not take TIF_NOTIFY_SIGNAL into account.
But how do you think this patch can make the things worse wrt CRIU ?
And let's even forget this patch which fixes the real problem.
How do you think the fact that the task sleeping in do_freezer_trap()
can react to TIF_NOTIFY_SIGNAL, call task_work_run(), and then sleep
in do_freezer_trap() again can make any difference in this sense?
> As this thing is kinda difficult to reason about,
Agreed,
> it'd probably be easier to just freeze them together if we can.
Agreed, but this needs some "generic" changes while Pavel needs a
simple and backportable workaround to suppress a real problem.
In short, I don't like this patch either, I just don't see a better
solution for now ;)
Thanks,
Oleg.
Powered by blists - more mailing lists