linux-kernel - Re: Fw: [lkp-developer] [sched,rcu] cf7a2dca60: [No primary change] +186% will-it-scale.time.involuntary_context


Message-ID: <20161214173923.GA16763@dhcp22.suse.cz>
Date:   Wed, 14 Dec 2016 18:39:24 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        peterz@...radead.org
Subject: Re: Fw: [lkp-developer] [sched,rcu]  cf7a2dca60: [No primary change]
 +186% will-it-scale.time.involuntary_context_switches

On Wed 14-12-16 08:48:27, Paul E. McKenney wrote:
> On Wed, Dec 14, 2016 at 05:15:41PM +0100, Michal Hocko wrote:
> > On Wed 14-12-16 03:06:09, Paul E. McKenney wrote:
> > > On Wed, Dec 14, 2016 at 10:54:25AM +0100, Michal Hocko wrote:
> > > > On Tue 13-12-16 07:14:08, Paul E. McKenney wrote:
> > > > > Just FYI for the moment...
> > > > > 
> > > > > So even with the slowed-down checking, making cond_resched() do what
> > > > > cond_resched_rcu_qs() does results in a smallish but quite measurable
> > > > > degradation according to 0day.
> > > > 
> > > > So if I understand those results properly, the reason seems to be the
> > > > increased involuntary context switches, right? Or am I misreading the
> > > > data?
> > > > I am looking at your "sched,rcu: Make cond_resched() provide RCU
> > > > quiescent state" in linux-next and I am wondering whether rcu_all_qs has
> > > > to be called unconditionally and not only when should_resched failed few
> > > > times? I guess you have discussed that with Peter already but do not
> > > > remember the outcome.
> > > 
> > > My first thought is to wait for the grace period to age further before
> > > checking, the idea being to avoid increasing cond_resched() overhead
> > > any further.  But if that doesn't work, then yes, I may have to look at
> > > adding more checks to cond_resched().
> > 
> > This might be really naive but would something like the following work?
> > The overhead should be pretty much negligible, I guess. Ideally the pcp
> > variable could be set somewhere from check_cpu_stall() but I couldn't
> > wrap my head around that code to see how exactly.
> 
> My concern (perhaps misplaced) with this approach is that there are
> quite a few tight loops containing cond_resched().  So I would still
> need to throttle the resulting grace-period acceleration to keep the
> context switches down to a dull roar.

Yes, I see your point. Something based on the stall timeout would of
course be much better. I just failed to come up with anything that would
make sense. That is down to my lack of familiarity with the code, so I
hope you will be more successful ;)
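
For illustration only, here is a userspace C sketch of the throttling
idea discussed above: report a quiescent state from the cond_resched()
path only after the "should we reschedule?" check has failed several
times in a row. This is not the patch referred to in this thread and it
is not kernel code. The names model_cond_resched(), struct cpu_state and
QS_REPORT_THRESHOLD are hypothetical, and the threshold value is an
arbitrary assumption.

```c
#include <stdbool.h>

/* Hypothetical threshold: number of consecutive calls that did not
 * reschedule before a quiescent state is reported. Assumed value. */
#define QS_REPORT_THRESHOLD 4

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct cpu_state {
	bool need_resched;           /* models the should_resched() outcome */
	unsigned int skipped;        /* consecutive calls without a reschedule */
	unsigned long qs_reports;    /* quiescent states reported so far */
	unsigned long resched_count; /* reschedules performed so far */
};

/* Models a cond_resched() call. Returns true if it "rescheduled". */
static bool model_cond_resched(struct cpu_state *cpu)
{
	if (cpu->need_resched) {
		/* A real context switch is itself a quiescent state,
		 * so the skip counter starts over. */
		cpu->need_resched = false;
		cpu->skipped = 0;
		cpu->resched_count++;
		return true;
	}

	/* Nothing to reschedule: only report a quiescent state once
	 * enough consecutive calls have passed without one. */
	if (++cpu->skipped >= QS_REPORT_THRESHOLD) {
		cpu->skipped = 0;
		cpu->qs_reports++;
	}
	return false;
}
```

With the assumed threshold of 4, ten back-to-back calls on an idle
state produce two reports rather than ten. That is the kind of
reduction a tight loop containing cond_resched() would need, although a
limit derived from the stall timeout would be the better criterion.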
-- 
Michal Hocko
SUSE Labs