Message-Id: <20161130115320.GO3924@linux.vnet.ibm.com>
Date:   Wed, 30 Nov 2016 03:53:20 -0800
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Donald Buczek <buczek@...gen.mpg.de>,
        Paul Menzel <pmenzel@...gen.mpg.de>, dvteam@...gen.mpg.de,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Josh Triplett <josh@...htriplett.org>
Subject: Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and
 `mem_cgroup_shrink_node`

On Wed, Nov 30, 2016 at 12:09:44PM +0100, Michal Hocko wrote:
> [CCing Paul]
> 
> On Wed 30-11-16 11:28:34, Donald Buczek wrote:
> [...]
> > shrink_active_list gets and releases the spinlock and calls cond_resched().
> > This should give other tasks a chance to run. Just as an experiment, I'm
> > trying
> > 
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1921,7 +1921,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
> >         spin_unlock_irq(&pgdat->lru_lock);
> > 
> >         while (!list_empty(&l_hold)) {
> > -               cond_resched();
> > +               cond_resched_rcu_qs();
> >                 page = lru_to_page(&l_hold);
> >                 list_del(&page->lru);
> > 
> > and didn't hit a rcu_sched warning for >21 hours uptime now. We'll see.
> 
> This is really interesting! Is it possible that the RCU stall detector
> is somehow confused?

No, it is not confused.  Again, cond_resched() is not a quiescent
state unless it does a context switch.  Therefore, if the task running
in that loop was the only runnable task on its CPU, cond_resched()
would -never- provide RCU with a quiescent state.

In contrast, cond_resched_rcu_qs() unconditionally provides RCU
with a quiescent state (hence the _rcu_qs in its name), regardless
of whether or not a context switch happens.

It is therefore expected behavior that this change might prevent
RCU CPU stall warnings.
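
To make the distinction concrete, here is a minimal userspace sketch. It is a toy model and not kernel code: struct toy_cpu, toy_cond_resched(), toy_cond_resched_rcu_qs(), and toy_run_loop() are hypothetical names invented for illustration, and the model assumes (as a simplification) that a context switch happens whenever another task is runnable on the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one CPU as seen by a greatly simplified RCU. */
struct toy_cpu {
	int runnable_tasks;    /* runnable tasks, including the looping one */
	long quiescent_states; /* quiescent states the toy RCU has observed */
};

/*
 * Models cond_resched(): a quiescent state is recorded only if a
 * context switch happens, which in this toy requires another runnable
 * task. Returns true if a "context switch" took place.
 */
static bool toy_cond_resched(struct toy_cpu *cpu)
{
	if (cpu->runnable_tasks > 1) {
		cpu->quiescent_states++;
		return true;
	}
	return false;
}

/*
 * Models cond_resched_rcu_qs(): if no context switch happened, a
 * quiescent state is recorded anyway, so every call yields one.
 */
static void toy_cond_resched_rcu_qs(struct toy_cpu *cpu)
{
	if (!toy_cond_resched(cpu))
		cpu->quiescent_states++;
}

/* Runs a loop like the one in the patch; returns quiescent states seen. */
static long toy_run_loop(struct toy_cpu *cpu, int iterations, bool use_rcu_qs)
{
	assert(iterations >= 0);
	cpu->quiescent_states = 0;
	for (int i = 0; i < iterations; i++) {
		if (use_rcu_qs)
			toy_cond_resched_rcu_qs(cpu);
		else
			(void)toy_cond_resched(cpu);
	}
	return cpu->quiescent_states;
}
```

In this model, a CPU whose only runnable task is the looping one records zero quiescent states with toy_cond_resched() no matter how many iterations run, which is the situation that would eventually trigger a stall warning. The same loop with toy_cond_resched_rcu_qs() records one per iteration.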

							Thanx, Paul

> > Is preemption disabled for another reason?
> 
> I do not think so. I will have to double check the code but this is a
> standard sleepable context. Just wondering what is the PREEMPT
> configuration here?
> -- 
> Michal Hocko
> SUSE Labs
> 
