Open Source and information security mailing list archives
Message-ID: <20110921123354.GC8501@tiehlicka.suse.cz>
Date:	Wed, 21 Sep 2011 14:33:56 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	Johannes Weiner <jweiner@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	Balbir Singh <bsingharora@...il.com>,
	Ying Han <yinghan@...gle.com>,
	Greg Thelen <gthelen@...gle.com>,
	Michel Lespinasse <walken@...gle.com>,
	Rik van Riel <riel@...hat.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Christoph Hellwig <hch@...radead.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 07/11] mm: vmscan: convert unevictable page rescue
 scanner to per-memcg LRU lists

On Mon 12-09-11 12:57:24, Johannes Weiner wrote:
> The global per-zone LRU lists are about to go away on memcg-enabled
> kernels, the unevictable page rescue scanner must be able to find its
> pages on the per-memcg LRU lists.
> 
> Signed-off-by: Johannes Weiner <jweiner@...hat.com>

The patch is correct but I guess the original implementation of
scan_zone_unevictable_pages is buggy (see below). This should be
addressed separately, though.

Reviewed-by: Michal Hocko <mhocko@...e.cz>

> ---
>  include/linux/memcontrol.h |    3 ++
>  mm/memcontrol.c            |   11 ++++++++
>  mm/vmscan.c                |   61 ++++++++++++++++++++++++++++---------------
>  3 files changed, 54 insertions(+), 21 deletions(-)
> 
[...]
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
[...]
> @@ -3490,32 +3501,40 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
>  #define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
>  static void scan_zone_unevictable_pages(struct zone *zone)
>  {
> -	struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
> -	unsigned long scan;
> -	unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);
> -
> -	while (nr_to_scan > 0) {
> -		unsigned long batch_size = min(nr_to_scan,
> -						SCAN_UNEVICTABLE_BATCH_SIZE);
> -
> -		spin_lock_irq(&zone->lru_lock);
> -		for (scan = 0;  scan < batch_size; scan++) {
> -			struct page *page = lru_to_page(l_unevictable);
> +	struct mem_cgroup *mem;
>  
> -			if (!trylock_page(page))
> -				continue;
> +	mem = mem_cgroup_iter(NULL, NULL, NULL);
> +	do {
> +		struct mem_cgroup_zone mz = {
> +			.mem_cgroup = mem,
> +			.zone = zone,
> +		};
> +		unsigned long nr_to_scan;
>  
> -			prefetchw_prev_lru_page(page, l_unevictable, flags);
> +		nr_to_scan = zone_nr_lru_pages(&mz, LRU_UNEVICTABLE);
> +		while (nr_to_scan > 0) {
> +			unsigned long batch_size;
> +			unsigned long scan;
>  
> -			if (likely(PageLRU(page) && PageUnevictable(page)))
> -				check_move_unevictable_page(page, zone);
> +			batch_size = min(nr_to_scan,
> +					 SCAN_UNEVICTABLE_BATCH_SIZE);
> +			spin_lock_irq(&zone->lru_lock);
> +			for (scan = 0; scan < batch_size; scan++) {
> +				struct page *page;
>  
> -			unlock_page(page);
> +				page = lru_tailpage(&mz, LRU_UNEVICTABLE);
> +				if (!trylock_page(page))
> +					continue;

We are not moving to the next page, so we will try it again in the next
round even though we have already increased the scan count. In the end
we will miss some pages.

> +				if (likely(PageLRU(page) &&
> +					   PageUnevictable(page)))
> +					check_move_unevictable_page(page, zone);
> +				unlock_page(page);
> +			}
> +			spin_unlock_irq(&zone->lru_lock);
> +			nr_to_scan -= batch_size;
>  		}
> -		spin_unlock_irq(&zone->lru_lock);
> -
> -		nr_to_scan -= batch_size;
> -	}
> +		mem = mem_cgroup_iter(NULL, mem, NULL);
> +	} while (mem);
>  }
>  
>  
> -- 
> 1.7.6
> 

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic
