Message-Id: <20091114023901.3DA8.A69D9226@jp.fujitsu.com>
Date:	Sat, 14 Nov 2009 03:00:57 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	Mel Gorman <mel@....ul.ie>
Cc:	kosaki.motohiro@...fujitsu.com,
	Andrew Morton <akpm@...ux-foundation.org>,
	Frans Pop <elendil@...net.nl>, Jiri Kosina <jkosina@...e.cz>,
	Sven Geggus <lists@...hsschwanzdomain.de>,
	Karol Lewandowski <karol.k.lewandowski@...il.com>,
	Tobias Oetiker <tobi@...iker.ch>, linux-kernel@...r.kernel.org,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Rik van Riel <riel@...hat.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Stephan von Krawczynski <skraw@...net.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Kernel Testers List <kernel-testers@...r.kernel.org>
Subject: Re: [PATCH 4/5] vmscan: Have kswapd sleep for a short interval and double check it should be asleep

> On Fri, Nov 13, 2009 at 07:43:09PM +0900, KOSAKI Motohiro wrote:
> > > After kswapd balances all zones in a pgdat, it goes to sleep. In the event
> > > of no IO congestion, kswapd can go to sleep very shortly after the high
> > > watermark was reached. If there is a constant stream of allocations from
> > > parallel processes, it can mean that kswapd went to sleep too quickly and
> > > the high watermark is not being maintained for a sufficient length of time.
> > > 
> > > This patch makes kswapd go to sleep as a two-stage process. It first
> > > tries to sleep for HZ/10. If it is woken up by another process or the
> > > high watermark is no longer met, it's considered a premature sleep and
> > > kswapd continues work. Otherwise it goes fully to sleep.
> > > 
> > > This adds more counters to distinguish between fast and slow breaches of
> > > watermarks. A "fast" premature sleep is one where the low watermark was
> > > hit in a very short time after kswapd going to sleep. A "slow" premature
> > > sleep indicates that the high watermark was breached after a very short
> > > interval.
> > > 
> > > Signed-off-by: Mel Gorman <mel@....ul.ie>
> > 
> > Why do you submit this patch to mainline? This is a debugging patch,
> > no more and no less.
> > 
> 
> Do you mean the stats part? The stats are included until such time as the page
> allocator failure reports stop or are significantly reduced. In the event a
> report is received, the values of the counters help determine if kswapd was
> struggling or not. They should be removed once this mess is ironed out.
> 
> If there is a preference, I can split out the stats part and send it to
> people with page allocator failure reports for retesting.

I'm sorry, my last mail didn't explain enough.
These stats help to solve this issue; I agree. But once the issue is solved,
I can't imagine how an administrator would use them. If KSWAPD_PREMATURE_FAST or
KSWAPD_PREMATURE_SLOW increases significantly, what should the admin do?
Or can LKML folks give the admin any advice?
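
For reference, all an admin can do today is read the raw numbers. Below is a
minimal userspace sketch of that, assuming the counters appear in /proc/vmstat
as "name value" lines under the renamed strings from the fix quoted below.
vmstat_lookup() is a hypothetical helper written for this mail, not an
existing kernel or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look up one counter in a stream formatted like
 * /proc/vmstat (one "name value" pair per line).
 * Returns 0 and stores the value on success, or -1 if the counter is
 * absent (for example, on a kernel without this patch).
 */
static int vmstat_lookup(FILE *fp, const char *name, unsigned long long *value)
{
	char key[128];
	unsigned long long v;

	assert(fp != NULL && name != NULL && value != NULL);

	/* Start from the top so the helper can be called repeatedly. */
	rewind(fp);
	while (fscanf(fp, "%127s %llu", key, &v) == 2) {
		if (strcmp(key, name) == 0) {
			*value = v;
			return 0;
		}
	}
	return -1;
}
```

A caller would fopen("/proc/vmstat", "r") and look up
"kswapd_low_wmark_hit_quickly" and "kswapd_high_wmark_hit_quickly". The
sketch only reports the values; it cannot say what a large value means, which
is exactly my question.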

If the kernel doesn't have a bug, the kswapd wakeup rate is not very useful
information, imho. Your additional code below itself looks good to me, but...


> ==== CUT HERE ====
> vmscan: Have kswapd sleep for a short interval and double check it should be asleep fix 1
> 
> This patch is a fix and a clarification to the patch "vmscan: Have
> kswapd sleep for a short interval and double check it should be asleep".
> The fix is for kswapd to only check zones in the node it is responsible
> for. The clarification is to rename two counters to better explain what is
> being counted.
> 
> Signed-off-by: Mel Gorman <mel@....ul.ie>
> --- 
>  include/linux/vmstat.h |    2 +-
>  mm/vmscan.c            |   20 +++++++++++++-------
>  mm/vmstat.c            |    4 ++--
>  3 files changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 7d66695..0591a48 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -40,7 +40,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>  		PGSCAN_ZONE_RECLAIM_FAILED,
>  #endif
>  		PGINODESTEAL, SLABS_SCANNED, KSWAPD_STEAL, KSWAPD_INODESTEAL,
> -		KSWAPD_PREMATURE_FAST, KSWAPD_PREMATURE_SLOW,
> +		KSWAPD_LOW_WMARK_HIT_QUICKLY, KSWAPD_HIGH_WMARK_HIT_QUICKLY,
>  		KSWAPD_NO_CONGESTION_WAIT,
>  		PAGEOUTRUN, ALLOCSTALL, PGROTATED,
>  #ifdef CONFIG_HUGETLB_PAGE
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 70967e1..5557555 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1905,19 +1905,25 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
>  #endif
>  
>  /* is kswapd sleeping prematurely? */
> -static int sleeping_prematurely(int order, long remaining)
> +static int sleeping_prematurely(pg_data_t *pgdat, int order, long remaining)
>  {
> -	struct zone *zone;
> +	int i;
>  
>  	/* If a direct reclaimer woke kswapd within HZ/10, it's premature */
>  	if (remaining)
>  		return 1;
>  
>  	/* If after HZ/10, a zone is below the high mark, it's premature */
> -	for_each_populated_zone(zone)
> +	for (i = 0; i < pgdat->nr_zones; i++) {
> +		struct zone *zone = pgdat->node_zones + i;
> +
> +		if (!populated_zone(zone))
> +			continue;
> +
>  		if (!zone_watermark_ok(zone, order, high_wmark_pages(zone),
>  								0, 0))
>  			return 1;
> +	}
>  
>  	return 0;
>  }
> @@ -2221,7 +2227,7 @@ static int kswapd(void *p)
>  				long remaining = 0;
>  
>  				/* Try to sleep for a short interval */
> -				if (!sleeping_prematurely(order, remaining)) {
> +				if (!sleeping_prematurely(pgdat, order, remaining)) {
>  					remaining = schedule_timeout(HZ/10);
>  					finish_wait(&pgdat->kswapd_wait, &wait);
>  					prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
> @@ -2232,13 +2238,13 @@ static int kswapd(void *p)
>  				 * premature sleep. If not, then go fully
>  				 * to sleep until explicitly woken up
>  				 */
> -				if (!sleeping_prematurely(order, remaining))
> +				if (!sleeping_prematurely(pgdat, order, remaining))
>  					schedule();
>  				else {
>  					if (remaining)
> -						count_vm_event(KSWAPD_PREMATURE_FAST);
> +						count_vm_event(KSWAPD_LOW_WMARK_HIT_QUICKLY);
>  					else
> -						count_vm_event(KSWAPD_PREMATURE_SLOW);
> +						count_vm_event(KSWAPD_HIGH_WMARK_HIT_QUICKLY);
>  				}
>  			}
>  
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index bc09547..6cc8dc6 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -683,8 +683,8 @@ static const char * const vmstat_text[] = {
>  	"slabs_scanned",
>  	"kswapd_steal",
>  	"kswapd_inodesteal",
> -	"kswapd_slept_prematurely_fast",
> -	"kswapd_slept_prematurely_slow",
> +	"kswapd_low_wmark_hit_quickly",
> +	"kswapd_high_wmark_hit_quickly",
>  	"kswapd_no_congestion_wait",
>  	"pageoutrun",
>  	"allocstall",
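
To make the per-node check and the counter choice in the fix above concrete,
here is a small userspace model of that logic. It is only a sketch under
stated assumptions: struct zone_model, struct pgdat_model and classify_sleep()
are hypothetical stand-ins invented for illustration, and the watermark test
is reduced to "free pages above the high mark", ignoring the allocation order
and lowmem reserves that zone_watermark_ok() takes into account.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct zone (assumption). */
struct zone_model {
	unsigned long present_pages;	/* 0 means the zone is not populated */
	unsigned long free_pages;
	unsigned long high_wmark;
};

/* Simplified stand-in for pg_data_t: only the zones of one node. */
struct pgdat_model {
	struct zone_model *node_zones;
	int nr_zones;
};

enum sleep_verdict {
	SLEEP_OK,			/* kswapd may go fully to sleep */
	LOW_WMARK_HIT_QUICKLY,		/* woken before the short sleep expired */
	HIGH_WMARK_HIT_QUICKLY,		/* sleep expired, a zone is below high */
};

/* Mirrors the shape of sleeping_prematurely(): check this node only. */
static int sleeping_prematurely_model(const struct pgdat_model *pgdat,
				      long remaining)
{
	int i;

	assert(pgdat != NULL);

	/* Woken during the short sleep: premature. */
	if (remaining)
		return 1;

	for (i = 0; i < pgdat->nr_zones; i++) {
		const struct zone_model *zone = &pgdat->node_zones[i];

		/* Skip zones with no pages, as populated_zone() would. */
		if (!zone->present_pages)
			continue;

		/* Simplified watermark test (see assumptions above). */
		if (zone->free_pages <= zone->high_wmark)
			return 1;
	}
	return 0;
}

/* Picks which counter the kswapd loop would bump, if any. */
static enum sleep_verdict classify_sleep(const struct pgdat_model *pgdat,
					 long remaining)
{
	if (!sleeping_prematurely_model(pgdat, remaining))
		return SLEEP_OK;
	return remaining ? LOW_WMARK_HIT_QUICKLY : HIGH_WMARK_HIT_QUICKLY;
}
```

The point of passing the node explicitly is that zones on other nodes can no
longer keep this node's kswapd awake, which is the fix itself.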


