Date:	Fri, 26 Nov 2010 11:11:22 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	Simon Kirby <sim@...tway.ca>, Shaohua Li <shaohua.li@...el.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>
Subject: Re: Free memory never fully used, swapping

On Fri, Nov 26, 2010 at 08:03:04PM +0900, KOSAKI Motohiro wrote:
> Two points.
> 
> > @@ -2310,10 +2324,12 @@ loop_again:
> >  				 * spectulatively avoid congestion waits
> >  				 */
> >  				zone_clear_flag(zone, ZONE_CONGESTED);
> > +				if (i <= pgdat->high_zoneidx)
> > +					any_zone_ok = 1;
> >  			}
> >  
> >  		}
> > -		if (all_zones_ok)
> > +		if (all_zones_ok || (order && any_zone_ok))
> >  			break;		/* kswapd: all done */
> >  		/*
> >  		 * OK, kswapd is getting into trouble.  Take a nap, then take
> > @@ -2336,7 +2352,7 @@ loop_again:
> >  			break;
> >  	}
> >  out:
> > -	if (!all_zones_ok) {
> > +	if (!(all_zones_ok || (order && any_zone_ok))) {
> 
> This doesn't work ;)
> kswapd have to clear ZONE_CONGESTED flag before enter sleeping.
> otherwise nobody can clear it.
> 

Does it not already do that earlier in balance_pgdat(), here?

                                /*
                                 * If a zone reaches its high watermark,
                                 * consider it to be no longer congested. It's
                                 * possible there are dirty pages backed by
                                 * congested BDIs but as pressure is
                                 * relieved, spectulatively avoid congestion waits
                                 */
                                zone_clear_flag(zone, ZONE_CONGESTED);
                                if (i <= pgdat->high_zoneidx)
                                        any_zone_ok = 1;

> Say, we have to fill below condition.
>  - All zone are successing zone_watermark_ok(order-0)

We should loop around at least once with order == 0 where all_zones_ok
is checked.

>  - At least one zone are successing zone_watermark_ok(high-order)
> 

This is preferable but it's possible for kswapd to go to sleep without
this condition being satisfied.
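
To make the exit test under discussion concrete, here is a minimal
userspace sketch of the condition from the patch,
all_zones_ok || (order && any_zone_ok). It is not kernel code: struct
toy_zone, its watermark_ok field and toy_balance_done() are hypothetical
names, and the single boolean stands in for the result of
zone_watermark_ok() at the requested order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a zone; not the kernel's struct zone. */
struct toy_zone {
	bool watermark_ok;	/* models zone_watermark_ok() at the requested order */
};

/*
 * Models the loop-exit test from the patch. Returns true when the
 * modelled balance loop would break out with "kswapd: all done".
 */
static bool toy_balance_done(const struct toy_zone *zones, int nr_zones,
			     int high_zoneidx, int order)
{
	bool all_zones_ok = true;
	bool any_zone_ok = false;

	assert(nr_zones > 0);

	for (int i = 0; i < nr_zones; i++) {
		if (!zones[i].watermark_ok)
			all_zones_ok = false;
		else if (i <= high_zoneidx)
			any_zone_ok = true;	/* a zone the waker can use is balanced */
	}

	/* For order-0 every zone must pass; for high orders one usable zone is enough. */
	return all_zones_ok || (order && any_zone_ok);
}
```

With one failing zone and one passing zone, this model reports "done" for
a high-order request only when the passing zone's index is at or below
high_zoneidx, and never for an order-0 request.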

> 
> 
> > @@ -2417,6 +2439,7 @@ static int kswapd(void *p)
> >  		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
> >  		new_order = pgdat->kswapd_max_order;
> >  		pgdat->kswapd_max_order = 0;
> > +		pgdat->high_zoneidx = MAX_ORDER;
> 
> I don't think MAX_ORDER is correct ;)
> 
>         high_zoneidx = pgdat->high_zoneidx;
>         pgdat->high_zoneidx = pgdat->nr_zones - 1;
> 
> ?
> 

Bah. It should have been MAX_NR_ZONES. This happens to still work because
MAX_ORDER will always be higher than MAX_NR_ZONES but it's wrong.
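
A small sketch of why the wrong constant is harmless here, assuming the
value is only ever used in the "i <= high_zoneidx" comparison shown
earlier: any reset value at or above the highest zone index lets every
zone through. TOY_MAX_NR_ZONES, TOY_MAX_ORDER and toy_count_eligible()
are hypothetical names with illustrative values, not the kernel's
definitions.

```c
#include <assert.h>

/* Illustrative values only; the real constants depend on kernel config. */
enum { TOY_MAX_NR_ZONES = 4, TOY_MAX_ORDER = 11 };

/* Counts the zones that pass the "i <= high_zoneidx" filter. */
static int toy_count_eligible(int nr_zones, int high_zoneidx)
{
	int eligible = 0;

	assert(nr_zones > 0 && nr_zones <= TOY_MAX_NR_ZONES);

	for (int i = 0; i < nr_zones; i++)
		if (i <= high_zoneidx)
			eligible++;

	return eligible;
}
```

In this model toy_count_eligible(3, TOY_MAX_ORDER) and
toy_count_eligible(3, TOY_MAX_NR_ZONES) both return 3, so the two reset
values behave identically, while a caller-supplied high_zoneidx of 0
restricts the count to 1.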

> And, we have another kswapd_max_order reading place. (after kswapd_try_to_sleep)
> We need it too.
> 

I'm not quite sure what you mean here. kswapd_max_order is read again
after kswapd tries to sleep (or wakes, for that matter), but that will be
in response to another caller having tried to wake kswapd, indicating that
those high orders really are needed.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab