lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141222142435.GA2900@dhcp22.suse.cz>
Date:	Mon, 22 Dec 2014 15:24:35 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Vladimir Davydov <vdavydov@...allels.com>
Cc:	Vlastimil Babka <vbabka@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>, stable@...r.kernel.org,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <hannes@...xchg.org>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH 1/2] mm, vmscan: prevent kswapd livelock due to
 pfmemalloc-throttled process being killed

On Sat 20-12-14 17:18:24, Vladimir Davydov wrote:
> On Sat, Dec 20, 2014 at 11:47:46AM +0100, Michal Hocko wrote:
> > On Fri 19-12-14 21:28:15, Vladimir Davydov wrote:
> > > So AFAIU the problem does exist. However, I think it could be fixed by
> > > simply waking up all processes waiting on pfmemalloc_wait before putting
> > > kswapd to sleep:
> > 
> > I think that a simple cond_resched() in kswapd_try_to_sleep should be
> > sufficient and less risky fix, so basically what Vlastimil was proposing
> > in the beginning.
> 
> With such a solution we implicitly rely upon the scheduler
> implementation, which AFAIU is wrong.

But this is a scheduling problem, isn't it? !PREEMPT kernel with a
kernel thread looping without a scheduling point which prevents the
woken task to run due to cpu affinity...

> E.g. suppose processes are
> governed by FIFO and kswapd happens to have a higher prio than the
> process killed by OOM. Then after cond_resched kswapd will be picked for
> execution again, and the killing process won't have a chance to remove
> itself from the wait queue.

Except that kswapd runs as SCHED_NORMAL with 0 priority.

> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 744e2b491527..2a123634c220 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2984,6 +2984,9 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
> > >  	if (remaining)
> > >  		return false;
> > >  
> > > +	if (!pgdat_balanced(pgdat, order, classzone_idx))
> > > +		return false;
> > > +
> > 
> > What would be consequences of not waking up pfmemalloc waiters while the
> > node is not balanced?
> 
> They will get woken up a bit later in balanced_pgdat. This might result
> in latency spikes though. In order not to change the original behaviour
> we could always wake all pfmemalloc waiters no matter if we are going to
> sleep or not:
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 744e2b491527..a21e0bd563c3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2993,10 +2993,7 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
>  	 * so wake them now if necessary. If necessary, processes will wake
>  	 * kswapd and get throttled again
>  	 */
> -	if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
> -		wake_up(&pgdat->pfmemalloc_wait);
> -		return false;
> -	}
> +	wake_up_all(&pgdat->pfmemalloc_wait);
>  
>  	return pgdat_balanced(pgdat, order, classzone_idx);

So you are relying on scheduling points somewhere down the
balance_pgdat. That should be sufficient. I am still quite surprised
that we have an OOM victim still on the queue and balanced pgdat here
because OOM victim didn't have chance to free memory. So somebody else
must have released a lot of memory after OOM.

This patch seems better than the one from Vlastimil. Care to post it
with the full changelog, please?

Thanks!

>  }

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ