[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAB=NE6V6rf25u335QKFjqm3euvyTvJRZrjVM83QBs8t1eweh3A@mail.gmail.com>
Date: Thu, 15 Jun 2017 09:35:45 -0700
From: "Luis R. Rodriguez" <mcgrof@...nel.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Josh Triplett <josh@...htriplett.org>,
Steven Rostedt <rostedt@...dmis.org>,
mathieu.desnoyers@...icios.com, jiangshanlai@...il.com,
Paul Gortmaker <paul.gortmaker@...driver.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Dmitry Torokhov <dmitry.torokhov@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC] rcu: use killable versions of swait
On Thu, Jun 15, 2017 at 9:22 AM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> On Thu, Jun 15, 2017 at 05:50:39PM +0200, Luis R. Rodriguez wrote:
>> On Wed, Jun 14, 2017 at 04:43:45PM -0700, Paul E. McKenney wrote:
>> > On Wed, Jun 14, 2017 at 04:06:39PM -0700, Luis R. Rodriguez wrote:
>> > > These waits don't even check for the return value for interruption,
>> > > using the non-killable variants means we could be killed by other
>> > > signals than SIGKILL, this is fragile.
>> >
>> > A number of people asserted that kthreads could never catch signals.
>> > I checked this at the time, and it seemed like this was the case, and
>> > the call to ignore_signals() seems to confirm this.
>> >
>> > So it looks to me like RCU doesn't care about this change (give or
>> > take any affect on the load average, anyway), but there is much that I
>> > don't know about Linux-kernel signal handling, so please feel free to
>> > educate me.
>>
>> Thanks, I had seen the kthread but figured best to ask, just got into parnaoia
>> mode. If we were to do a sanity check for usage we'd then have to white list
>> when kthreads are used, however since we don't care to be interrupted why not
>> use a wait which is also explicit about our current uninterruptible state?
>
> I do appreciate any and all inspection, actually, so thank you!
>
> I used to have it uninterruptable, but got complaints on the effect on
> the load average -- sysadms didn't like the "D" state and the fact that
> the load average was always greater than 2 (or 3 on PREEMPT=y kernels)
> when the system was completely idle.
Ah wow, never would I have imagined this to be the reason. Are you
aware of other cases in the kernel that use interruptible wait even
though kthreads are used (therefore uninterruptible) as work around
for this? The above information is very helpful for folks doing random
code introspection, having this annotated somehow via API seems fair
to me. Would we be concerned with abuse? Would it be fair for other
kthreads to use similar strategy? If not, why?
>> > > Signed-off-by: Luis R. Rodriguez <mcgrof@...nel.org>
>> > > ---
>> > >
>> > > The killable swaits were just posted [1] as part of a series where SIGCHLD
>> > > was detected as interrupting and killing kernel calls waiting using
>> > > non-killable swaits [1]. The fragility here made curious about other callers
>> > > and seeing if they really meant to use such broad wait which captures a lot
>> > > of signals.
>> > >
>> > > I can't see why we'd want to have these killed by other signals, specialy
>> > > since it seems we don't even check for the return value... Granted to abort
>> > > properly we'd have to check for the return value for -ERESTARTSYS, but yeah,
>> > > none of this is done, so it would seem we don't want fragile signals
>> > > interrupting these ?
>> >
>> > The later WARN_ON(signal_pending(current)) complains if a signal somehow
>> > makes it to this task. Assuming that the signal is nonfatal, anyway.
>>
>> I see, how about just using swait_event_timeout() and removing the WARN_ON()?
>> Is there a reason for having the interruptible ?
>
> If sleeping-uninterruptible kthreads are now excluded from the load average,
> no reason. But if sleeping-uninterruptible kthreads are still included in
> the load average, it must stay interruptible.
Got it!
>> > > Also can someone confirm if the original change of to swait_event_timeout()
>> > > from wait_event_interruptible_timeout() was actually intentional on
>> > > synchronize_sched_expedited_wait() on commit abedf8e2419fb ("rcu: Use simple
>> > > wait queues where possible in rcutree") ? I can't easily confirm.
>> >
>> > This is also called from a workqueue (at least once the core_initcall()s
>> > get done), which again should mean no signals. A WARN_ON(ret < 0)
>> > complains if this ever changes. So it should be OK that this is
>> > swait_event_timeout(). And expedited grace periods are supposed to
>> > get done quickly, so effect on the load average should be negligible.
>>
>> Great thanks.
>>
>> > Or am I missing something?
>>
>> No, just got into parnoia mode and better to ask than be sorry later!
>
> Indeed, better safe that sorry!
:D
Luis
Powered by blists - more mailing lists