[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210614161221.GC68749@worktop.programming.kicks-ass.net>
Date: Mon, 14 Jun 2021 18:12:21 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: rjw@...ysocki.net, mingo@...nel.org, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, mgorman@...e.de,
Will Deacon <will@...nel.org>, Tejun Heo <tj@...nel.org>,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH] freezer,sched: Rewrite core freezer logic
On Mon, Jun 14, 2021 at 05:42:47PM +0200, Oleg Nesterov wrote:
> Hi Peter, sorry for delay,
>
> On 06/11, Peter Zijlstra wrote:
> >
> > +/* Recursion relies on tail-call optimization to not blow away the stack */
> > +static bool __frozen(struct task_struct *p)
> > +{
> > + if (p->state == TASK_FROZEN)
> > + return true;
> > +
> > + /*
> > + * If stuck in TRACED, and the ptracer is FROZEN, we're frozen too.
> > + */
> > + if (task_is_traced(p))
> > + return frozen(rcu_dereference(p->parent));
>
> Why does it use frozen(), not __frozen() ?
(because I'm an idiot :/)
> This looks racy, p->parent can resume this task and then enter
> __refrigerator().
But this is about the child, we won't report it frozen, unless the
parent is also frozen. If the parent is frozen, it cannot resume the
task.
The other way around, if the parent resumes the task and then gets
frozen, then we'll wait until the task gets frozen.
That is, I don't see the race. Maybe it's been too warm, but could you
spell it out?
> Plus this task can be SIGKILL'ed even if it is traced.
Hurmm.. *that* is a problem.
> > + /*
> > + * If stuck in STOPPED and the parent is FROZEN, we're frozen too.
> > + */
> > + if (task_is_stopped(p))
> > + return frozen(rcu_dereference(p->real_parent));
>
> (you could use ->parent in this case too and unify this check with the
> "traced" case above)
Are you sure? The way I read the code ptrace_attach() will change
->parent, but STOPPED is controlled by the jobctl.
> I don't understand. How this connects to ->parent or ->real_parent?
> SIGCONT can come from anywhere and wake this stopped task up?
Could be me who's not understanding, I thought only the real parent
could do that.
> I guess you do this to avoid freezable_schedule() in ptrace/signal_stop,
> and we can't use TASK_STOPPED|TASK_FREEZABLE, it should not run after
> thaw()... But see above, we can't rely on __frozen(parent).
I do this because freezing puts a task in TASK_FROZEN, and that cannot
preserve TAKS_STOPPED or TASK_TRACED without being subject to wakups
from those bits. I suppose I can add TASK_FROZEN_STOPPED and
TASK_FROZEN_TRACED bits. Let me try that... (tomorrow, brain is cooked).
Powered by blists - more mailing lists