Date:   Mon, 12 Dec 2022 12:35:57 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
cc:     stable@...r.kernel.org, patches@...ts.linux.dev,
        Alex Shi <alex.shi@...ux.alibaba.com>,
        Hugh Dickins <hughd@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        "Chen, Rong A" <rong.a.chen@...el.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        "Huang, Ying" <ying.huang@...el.com>, Jann Horn <jannh@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...nel.org>,
        Michal Hocko <mhocko@...e.com>,
        Mika Penttilä <mika.penttila@...tfour.com>,
        Minchan Kim <minchan@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>, Tejun Heo <tj@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Wei Yang <richard.weiyang@...il.com>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Sasha Levin <sashal@...nel.org>, Gavin Shan <gshan@...hat.com>,
        Zhenyu Zhang <zhenyzha@...hat.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 5.10 001/106] mm/mlock: remove lru_lock on
 TestClearPageMlocked

On Mon, 12 Dec 2022, Greg Kroah-Hartman wrote:

> From: Alex Shi <alex.shi@...ux.alibaba.com>
> 
> [ Upstream commit 3db19aa39bac33f2e850fa1ddd67be29b192e51f ]
> 
> In munlock_vma_page(), the comments said that lru_lock was needed for
> serialization with split_huge_pages.  But the page must be PageLocked
> there, as must the pages in the split_huge_page series of functions.
> Thus PageLocked is enough to serialize the two.
> 
> Furthermore, Hugh Dickins pointed out that before splitting in
> split_huge_page_to_list, the page is passed to unmap_page() to remove
> the pmd/ptes, which protects the page from munlock.  Thus there is no
> need to guard __split_huge_page_tail for mlock clearing; just keep the
> lru_lock there for isolation purposes.
> 
> LKP found a preempt issue in __mod_zone_page_state, which needs to be
> changed to mod_zone_page_state.  Thanks!
> 
> Link: https://lkml.kernel.org/r/1604566549-62481-13-git-send-email-alex.shi@linux.alibaba.com
> Signed-off-by: Alex Shi <alex.shi@...ux.alibaba.com>
> Acked-by: Hugh Dickins <hughd@...gle.com>
> Acked-by: Johannes Weiner <hannes@...xchg.org>
> Acked-by: Vlastimil Babka <vbabka@...e.cz>
> Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Cc: Alexander Duyck <alexander.duyck@...il.com>
> Cc: Andrea Arcangeli <aarcange@...hat.com>
> Cc: Andrey Ryabinin <aryabinin@...tuozzo.com>
> Cc: "Chen, Rong A" <rong.a.chen@...el.com>
> Cc: Daniel Jordan <daniel.m.jordan@...cle.com>
> Cc: "Huang, Ying" <ying.huang@...el.com>
> Cc: Jann Horn <jannh@...gle.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@....com>
> Cc: Kirill A. Shutemov <kirill@...temov.name>
> Cc: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
> Cc: Matthew Wilcox (Oracle) <willy@...radead.org>
> Cc: Mel Gorman <mgorman@...hsingularity.net>
> Cc: Michal Hocko <mhocko@...nel.org>
> Cc: Michal Hocko <mhocko@...e.com>
> Cc: Mika Penttilä <mika.penttila@...tfour.com>
> Cc: Minchan Kim <minchan@...nel.org>
> Cc: Shakeel Butt <shakeelb@...gle.com>
> Cc: Tejun Heo <tj@...nel.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Vladimir Davydov <vdavydov.dev@...il.com>
> Cc: Wei Yang <richard.weiyang@...il.com>
> Cc: Yang Shi <yang.shi@...ux.alibaba.com>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Stable-dep-of: 829ae0f81ce0 ("mm: migrate: fix THP's mapcount on isolation")
> Signed-off-by: Sasha Levin <sashal@...nel.org>

NAK from me to patches 001 through 007 here: 001 through 006 are a
risky subset of patches and followups to a per-memcg per-node lru_lock
series from Alex Shi, which made subtle changes to locking, memcg
charging, lru management, page migration etc.

The whole series could be backported to 5.10 (I did so myself for
internal usage), but cherry-picking parts of it into 5.10-stable is
misguided and contrary to stable principles.

Maybe there is in fact nothing wrong with the selection made:
but then give linux-mm guys two or three weeks to review and
test and give the thumbs up to that selection.

Much easier, quicker and safer would be to adjust 007 (I presume
the reason behind 001 through 006) to fit the 5.10-stable tree:
I can do that myself if you ask, but not until later this week.

Hugh

> ---
>  mm/mlock.c | 26 +++++---------------------
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 884b1216da6a..796c726a0407 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -187,40 +187,24 @@ static void __munlock_isolation_failed(struct page *page)
>  unsigned int munlock_vma_page(struct page *page)
>  {
>  	int nr_pages;
> -	pg_data_t *pgdat = page_pgdat(page);
>  
>  	/* For try_to_munlock() and to serialize with page migration */
>  	BUG_ON(!PageLocked(page));
> -
>  	VM_BUG_ON_PAGE(PageTail(page), page);
>  
> -	/*
> -	 * Serialize with any parallel __split_huge_page_refcount() which
> -	 * might otherwise copy PageMlocked to part of the tail pages before
> -	 * we clear it in the head page. It also stabilizes thp_nr_pages().
> -	 */
> -	spin_lock_irq(&pgdat->lru_lock);
> -
>  	if (!TestClearPageMlocked(page)) {
>  		/* Potentially, PTE-mapped THP: do not skip the rest PTEs */
> -		nr_pages = 1;
> -		goto unlock_out;
> +		return 0;
>  	}
>  
>  	nr_pages = thp_nr_pages(page);
> -	__mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
> +	mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
>  
> -	if (__munlock_isolate_lru_page(page, true)) {
> -		spin_unlock_irq(&pgdat->lru_lock);
> +	if (!isolate_lru_page(page))
>  		__munlock_isolated_page(page);
> -		goto out;
> -	}
> -	__munlock_isolation_failed(page);
> -
> -unlock_out:
> -	spin_unlock_irq(&pgdat->lru_lock);
> +	else
> +		__munlock_isolation_failed(page);
>  
> -out:
>  	return nr_pages - 1;
>  }
>  
> -- 
> 2.35.1
> 
> 
> 
> 
