linux-kernel - Re: [RFC] Make need_resched() return true when rcu_urgent


Message-ID: <1531127935.18697.57.camel@infradead.org>
Date:   Mon, 09 Jul 2018 10:18:55 +0100
From:   David Woodhouse <dwmw2@...radead.org>
To:     Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     mhillenb@...zon.de, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] Make need_resched() return true when rcu_urgent_qs
 requested



On Mon, 2018-07-09 at 10:53 +0200, Peter Zijlstra wrote:
> On Fri, Jul 06, 2018 at 10:11:50AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 06, 2018 at 06:29:05PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 06, 2018 at 03:53:30PM +0100, David Woodhouse wrote:
> > > > 
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index e4d4e60..89f5814 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -1616,7 +1616,8 @@ static inline int spin_needbreak(spinlock_t *lock)
> > > >  
> > > >  static __always_inline bool need_resched(void)
> > > >  {
> > > > -	return unlikely(tif_need_resched());
> > > > +	return unlikely(tif_need_resched()) ||
> > > > +		rcu_urgent_qs_requested();
> > > >  }
> > > Instead of making need_resched() touch two cachelines, I think I would
> > > prefer adding resched_cpu() to rcu_request_urgent_qs_task().
>
> > I used to do something like this, but decided that whacking each holdout
> > CPU over the head ten times a second was a bit much.
>
> This is only called from the !list_empty(rcu_tasks_holdout) loop in
> rcu_tasks_kthread afaict, and that has a
> schedule_timeout_interruptible(HZ) in it, which I read as once a second.
> 
> Which seems like an entirely reasonable amount of time to kick a task.
> Not scheduling for a second is like an eternity.

If that is our only "fix" for KVM, then wouldn't that mean that things
like expand_fdtable() would be *expected* to take "an eternity" when
another CPU happens to be in the guest? Because vcpu_run() would still
loop until the task gets kicked after a second?

Of course, we can explicitly put a check into the KVM loop, but that
brings me back to my original concern: why is it OK to do it there as
a special case, and not for the general-case construct of

  if (need_resched()) { drop_local_locks(); cond_resched(); get_local_locks(); }

?
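
To make the latency concern concrete, here is a minimal userspace sketch
of the two variants of need_resched() discussed in this thread. It is a
model only, not kernel code: struct cpu_model, need_resched_orig(),
need_resched_rfc() and run_until_yield() are hypothetical names made up
for illustration, the two booleans merely stand in for TIF_NEED_RESCHED
and the RCU urgent-QS request, and "kick_at" stands in for the point at
which a resched_cpu()-style kick finally arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these flags stand in for the kernel state. */
struct cpu_model {
	bool tif_need_resched;	/* models TIF_NEED_RESCHED */
	bool rcu_urgent_qs;	/* models a pending RCU urgent-QS request */
};

/* Models the existing check: only the thread flag is consulted. */
static bool need_resched_orig(const struct cpu_model *cpu)
{
	return cpu->tif_need_resched;
}

/* Models the RFC: also report true when RCU urgently wants a QS. */
static bool need_resched_rfc(const struct cpu_model *cpu)
{
	return cpu->tif_need_resched || cpu->rcu_urgent_qs;
}

/*
 * Models a vcpu_run()-style loop that only leaves when check() fires.
 * At iteration kick_at an external kick sets the thread flag.
 * Returns the iteration at which the loop yields, or max_iters if it
 * never noticed anything.
 */
static int run_until_yield(struct cpu_model *cpu,
			   bool (*check)(const struct cpu_model *),
			   int kick_at, int max_iters)
{
	assert(max_iters > 0);

	for (int i = 0; i < max_iters; i++) {
		if (i == kick_at)
			cpu->tif_need_resched = true;
		if (check(cpu))
			return i;
	}
	return max_iters;
}
```

In this model, a loop using need_resched_orig() with an urgent-QS
request already pending keeps spinning until the kick arrives, whereas
the same loop using need_resched_rfc() yields on its first check. The
model says nothing about the extra cacheline cost Peter raised; it only
illustrates who has to wait, and for how long.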
