linux-kernel - Re: [PATCH 4/5] mm: fix check_move_unevictable

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5875bdbe-c1af-737b-9ec2-4b48b3c2465f@linux.alibaba.com>
Date:   Tue, 1 Sep 2020 10:04:13 +0800
From:   Alex Shi <alex.shi@...ux.alibaba.com>
To:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...e.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Matthew Wilcox <willy@...radead.org>, Qian Cai <cai@....pw>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        Kuo-Hsin Yang <vovoy@...omium.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 4/5] mm: fix check_move_unevictable_pages() on THP



在 2020/8/31 上午5:08, Hugh Dickins 写道:
> check_move_unevictable_pages() is used in making unevictable shmem pages
> evictable: by shmem_unlock_mapping(), drm_gem_check_release_pagevec() and
> i915/gem check_release_pagevec().  Those may pass down subpages of a huge
> page, when /sys/kernel/mm/transparent_hugepage/shmem_enabled is "force".
> 
> That does not crash or warn at present, but the accounting of vmstats
> unevictable_pgs_scanned and unevictable_pgs_rescued is inconsistent:
> scanned being incremented on each subpage, rescued only on the head
> (since tails already appear evictable once the head has been updated).
> 
> 5.8 commit 5d91f31faf8e ("mm: swap: fix vmstats for huge page") has
> established that vm_events in general (and unevictable_pgs_rescued in
> particular) should count every subpage: so follow that precedent here.
> 
> Do this in such a way that if mem_cgroup_page_lruvec() is made stricter
> (to check page->mem_cgroup is always set), no problem: skip the tails
> before calling it, and add thp_nr_pages() to vmstats on the head.
> 
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
> ---
> Nothing here worth going to stable, since it's just a testing config
> that is fixed, whose event numbers are not very important; but this
> will be needed before Alex Shi's warning, and might as well go in now.
> 
> The callers of check_move_unevictable_pages() could be optimized,
> to skip over tails: but Matthew Wilcox has other changes in flight
> there, so let's skip the optimization for now.
> 
>  mm/vmscan.c |   10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> --- 5.9-rc2/mm/vmscan.c	2020-08-16 17:32:50.721507348 -0700
> +++ linux/mm/vmscan.c	2020-08-28 17:47:10.595580876 -0700
> @@ -4260,8 +4260,14 @@ void check_move_unevictable_pages(struct
>  	for (i = 0; i < pvec->nr; i++) {
>  		struct page *page = pvec->pages[i];
>  		struct pglist_data *pagepgdat = page_pgdat(page);
> +		int nr_pages;
> +
> +		if (PageTransTail(page))
> +			continue;
> +
> +		nr_pages = thp_nr_pages(page);
> +		pgscanned += nr_pages;
>  
> -		pgscanned++;
>  		if (pagepgdat != pgdat) {
>  			if (pgdat)
>  				spin_unlock_irq(&pgdat->lru_lock);
> @@ -4280,7 +4286,7 @@ void check_move_unevictable_pages(struct
>  			ClearPageUnevictable(page);
>  			del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE);
>  			add_page_to_lru_list(page, lruvec, lru);

So, we might randomly del or add a thp tail page into lru?
It's interesting to know here. :)

Thanks
Alex

> -			pgrescued++;
> +			pgrescued += nr_pages;
>  		}
>  	}
>  
>