Message-Id: <20170614234345.GX3721@linux.vnet.ibm.com>
Date: Wed, 14 Jun 2017 16:43:45 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc: josh@...htriplett.org, rostedt@...dmis.org,
mathieu.desnoyers@...icios.com, jiangshanlai@...il.com,
paul.gortmaker@...driver.com, ebiederm@...ssion.com,
dmitry.torokhov@...il.com, linux-kernel@...r.kernel.org,
oleg@...hat.com
Subject: Re: [RFC] rcu: use killable versions of swait
On Wed, Jun 14, 2017 at 04:06:39PM -0700, Luis R. Rodriguez wrote:
> These waits don't even check the return value for interruption;
> using the non-killable variants means the wait can be interrupted by
> signals other than SIGKILL, which is fragile.
A number of people asserted that kthreads could never catch signals.
I checked this at the time, and it did appear to be the case, and the
call to ignore_signals() seems to confirm it.
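
For illustration only, here is a minimal, hypothetical kthread (not code
from the tree) showing what I mean: kernel threads run with every signal
ignored, so an interruptible wait in one of them should never actually
see a pending signal unless something like allow_signal() re-enables one.

/*
 * Hypothetical sketch, not kernel/rcu code: kthreads inherit the
 * ignore-all-signals setup done via ignore_signals(), so
 * signal_pending() is expected to stay false here.
 */
#include <linux/kthread.h>
#include <linux/sched/signal.h>
#include <linux/delay.h>
#include <linux/bug.h>

static int sig_demo_thread(void *unused)
{
	/* allow_signal(SIGKILL) here would make SIGKILL deliverable again. */
	while (!kthread_should_stop()) {
		msleep_interruptible(1000);
		/* With all signals ignored, this should never fire. */
		WARN_ON(signal_pending(current));
	}
	return 0;
}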
So it looks to me like RCU doesn't care about this change (give or
take any effect on the load average, anyway), but there is much that I
don't know about Linux-kernel signal handling, so please feel free to
educate me.
> Signed-off-by: Luis R. Rodriguez <mcgrof@...nel.org>
> ---
>
> The killable swaits were just posted [1] as part of a series where SIGCHLD
> was found to interrupt and kill kernel calls waiting on non-killable
> swaits [1]. The fragility here made me curious about other callers and
> whether they really meant to use such a broad wait, one that captures a
> lot of signals.
>
> I can't see why we'd want to have these killed by other signals, especially
> since it seems we don't even check the return value... Granted, to abort
> properly we'd have to check the return value for -ERESTARTSYS, but yeah,
> none of this is done, so it would seem we don't want fragile signals
> interrupting these?
The later WARN_ON(signal_pending(current)) complains if a signal somehow
makes it to this task. Assuming that the signal is nonfatal, anyway.
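
The pattern is roughly the following (paraphrased from memory, not a
verbatim quote of rcu_gp_kthread() in kernel/rcu/tree.c):

	for (;;) {
		rsp->gp_state = RCU_GP_WAIT_GPS;
		swait_event_interruptible(rsp->gp_wq,
					  READ_ONCE(rsp->gp_flags) &
					  RCU_GP_FLAG_INIT);
		rsp->gp_state = RCU_GP_DONE_GPS;
		if (rcu_gp_init(rsp))
			break;	/* The condition really did become true. */
		/* Early return, e.g. from a signal: complain, then re-wait. */
		WARN_ON(signal_pending(current));
	}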
> Also, can someone confirm whether the original change from
> wait_event_interruptible_timeout() to swait_event_timeout() in
> synchronize_sched_expedited_wait() was actually intentional in commit
> abedf8e2419fb ("rcu: Use simple wait queues where possible in rcutree")?
> I can't easily confirm it.
This is also called from a workqueue (at least once the core_initcall()s
get done), which again should mean no signals. A WARN_ON(ret < 0)
complains if this ever changes. So it should be OK that this is
swait_event_timeout(). And expedited grace periods are supposed to
get done quickly, so the effect on the load average should be negligible.
Or am I missing something?
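
In case it helps, here is a stand-alone sketch (hypothetical names, not
the actual expedited-grace-period code) of a timed swait from a work
item, with the return value checked the way I describe above:

/*
 * Hypothetical sketch, not kernel/rcu code: swait_event_timeout()
 * sleeps TASK_UNINTERRUPTIBLE, returning 0 on timeout and >0 if the
 * condition became true, so a negative value in a workqueue worker
 * would be unexpected and worth a WARN_ON().
 */
#include <linux/swait.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>
#include <linux/printk.h>

static DECLARE_SWAIT_QUEUE_HEAD(demo_swq);
static bool demo_done;

static void demo_work_fn(struct work_struct *unused)
{
	long ret;

	ret = swait_event_timeout(demo_swq, READ_ONCE(demo_done), 10 * HZ);
	WARN_ON(ret < 0);	/* Workqueue workers should see no signals. */
	if (!ret)
		pr_info("timed out waiting for demo_done\n");
}

static DECLARE_WORK(demo_work, demo_work_fn);

/* Elsewhere: WRITE_ONCE(demo_done, true); swake_up(&demo_swq); */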
Thanx, Paul
> [0] https://lkml.kernel.org/r/20170614222017.14653-3-mcgrof@kernel.org
> [1] https://lkml.kernel.org/r/20170614222017.14653-1-mcgrof@kernel.org
>
> kernel/rcu/tree.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 695fee7cafe0..9a8d06486a3c 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2191,7 +2191,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
> READ_ONCE(rsp->gpnum),
> TPS("reqwait"));
> rsp->gp_state = RCU_GP_WAIT_GPS;
> - swait_event_interruptible(rsp->gp_wq,
> + swait_event_killable(rsp->gp_wq,
> READ_ONCE(rsp->gp_flags) &
> RCU_GP_FLAG_INIT);
> rsp->gp_state = RCU_GP_DONE_GPS;
> @@ -2224,7 +2224,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
> READ_ONCE(rsp->gpnum),
> TPS("fqswait"));
> rsp->gp_state = RCU_GP_WAIT_FQS;
> - ret = swait_event_interruptible_timeout(rsp->gp_wq,
> + ret = swait_event_killable_timeout(rsp->gp_wq,
> rcu_gp_fqs_check_wake(rsp, &gf), j);
> rsp->gp_state = RCU_GP_DOING_FQS;
> /* Locking provides needed memory barriers. */
> --
> 2.11.0
>