Date:   Thu, 16 Feb 2017 09:32:32 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Hillf Danton <hillf.zj@...baba-inc.com>
Cc:     'Andrew Morton' <akpm@...ux-foundation.org>,
        'Shantanu Goel' <sgoel01@...oo.com>,
        'Chris Mason' <clm@...com>,
        'Johannes Weiner' <hannes@...xchg.org>,
        'Vlastimil Babka' <vbabka@...e.cz>,
        'LKML' <linux-kernel@...r.kernel.org>,
        'Linux-MM' <linux-mm@...ck.org>
Subject: Re: [PATCH 3/3] mm, vmscan: Prevent kswapd sleeping prematurely due
 to mismatched classzone_idx

On Thu, Feb 16, 2017 at 04:21:04PM +0800, Hillf Danton wrote:
> 
> On February 16, 2017 4:11 PM Mel Gorman wrote:
> > On Thu, Feb 16, 2017 at 02:23:08PM +0800, Hillf Danton wrote:
> > > On February 15, 2017 5:23 PM Mel Gorman wrote:
> > > >   */
> > > >  static int kswapd(void *p)
> > > >  {
> > > > -	unsigned int alloc_order, reclaim_order, classzone_idx;
> > > > +	unsigned int alloc_order, reclaim_order;
> > > > +	unsigned int classzone_idx = MAX_NR_ZONES - 1;
> > > >  	pg_data_t *pgdat = (pg_data_t*)p;
> > > >  	struct task_struct *tsk = current;
> > > >
> > > > @@ -3447,20 +3466,23 @@ static int kswapd(void *p)
> > > >  	tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
> > > >  	set_freezable();
> > > >
> > > > -	pgdat->kswapd_order = alloc_order = reclaim_order = 0;
> > > > -	pgdat->kswapd_classzone_idx = classzone_idx = 0;
> > > > +	pgdat->kswapd_order = 0;
> > > > +	pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
> > > >  	for ( ; ; ) {
> > > >  		bool ret;
> > > >
> > > > +		alloc_order = reclaim_order = pgdat->kswapd_order;
> > > > +		classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
> > > > +
> > > >  kswapd_try_sleep:
> > > >  		kswapd_try_to_sleep(pgdat, alloc_order, reclaim_order,
> > > >  					classzone_idx);
> > > >
> > > >  		/* Read the new order and classzone_idx */
> > > >  		alloc_order = reclaim_order = pgdat->kswapd_order;
> > > > -		classzone_idx = pgdat->kswapd_classzone_idx;
> > > > +		classzone_idx = kswapd_classzone_idx(pgdat, 0);
> > > >  		pgdat->kswapd_order = 0;
> > > > -		pgdat->kswapd_classzone_idx = 0;
> > > > +		pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
> > > >
> > > >  		ret = try_to_freeze();
> > > >  		if (kthread_should_stop())
> > > > @@ -3486,9 +3508,6 @@ static int kswapd(void *p)
> > > >  		reclaim_order = balance_pgdat(pgdat, alloc_order, classzone_idx);
> > > >  		if (reclaim_order < alloc_order)
> > > >  			goto kswapd_try_sleep;
> > >
> > > If we fail order-5 request,  can we then give up order-5, and
> > > try order-3 if requested, after napping?
> > >
> > 
> > That has no bearing upon this patch. At this point, kswapd has stopped
> > reclaiming at the requested order and is preparing to sleep. If there is
> > a parallel request for order-3 while it's sleeping, it'll wake and start
> > reclaiming at order-3 as requested.
> > 
>
> Right, but the order-3 request can also come up while kswapd is active and
> gives up order-5.
> 

And then it'll be in pgdat->kswapd_order and be picked up on the next
wakeup. It won't be immediate but it's also unlikely to be worth picking
up immediately. The context here is that a high-order reclaim request
failed and, rather than keeping kswapd awake reclaiming the world, kswapd
goes to sleep until another wakeup request comes in. Staying awake
continually for high orders caused problems with excessive reclaim in
the past.

It could be revisited again but it's not related to what this patch is
aimed at -- avoiding kswapd going to sleep because ZONE_DMA is balanced
for a GFP_DMA request which is nowhere in the request stream.

-- 
Mel Gorman
SUSE Labs
