Message-ID: <826e69d0-c81c-06c1-c675-b54bd4557ff3@suse.cz>
Date:   Fri, 11 Feb 2022 13:27:27 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Michal Hocko <mhocko@...e.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Matthew Wilcox <willy@...radead.org>,
        David Hildenbrand <david@...hat.com>,
        Alistair Popple <apopple@...dia.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Rik van Riel <riel@...riel.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Yu Zhao <yuzhao@...gle.com>, Greg Thelen <gthelen@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 06/13] mm/munlock: maintain page->mlock_count while
 unevictable

On 2/6/22 22:40, Hugh Dickins wrote:
> Previous patches have been preparatory: now implement page->mlock_count.
> The ordering of the "Unevictable LRU" is of no significance, and there is
> no point holding unevictable pages on a list: place page->mlock_count to
> overlay page->lru.prev (since page->lru.next is overlaid by compound_head,
> which needs to be even so as not to satisfy PageTail - though 2 could be
> added instead of 1 for each mlock, if that's ever an improvement).
> 
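For anyone reading along in the archive: the mm_types.h part of the
patch (not quoted here) is what makes this overlay concrete, roughly
along these lines (my paraphrase, not the exact hunk):

	union {
		struct list_head lru;
		/* Or, for the Unevictable "LRU list" slot: */
		struct {
			/* Always even, to negate PageTail */
			void *__filler;
			/* Count page's or folio's mlocks */
			unsigned int mlock_count;
		};
	};

So lru.next keeps doubling as compound_head (hence the evenness
requirement above), and lru.prev is reused as the counter.
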
> But it's only safe to rely on or modify page->mlock_count while lruvec
> lock is held and page is on unevictable "LRU" - we can save lots of edits
> by continuing to pretend that there's an imaginary LRU here (there is an
> unevictable count which still needs to be maintained, but not a list).
> 
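To restate the invariant in code form (an illustrative sketch of mine,
not code copied from mm/mlock.c): anyone touching mlock_count is
expected to do so under the lruvec lock, and only while the page is
flagged unevictable and nominally on the LRU, e.g.:

	static void bump_mlock_count(struct page *page)
	{
		struct lruvec *lruvec;

		lruvec = folio_lruvec_lock_irq(page_folio(page));
		if (PageLRU(page) && PageUnevictable(page)) {
			/* lru.prev is currently serving as mlock_count */
			page->mlock_count++;
		}
		/* otherwise: err on the safe side, behave as if count is 0 */
		unlock_page_lruvec_irq(lruvec);
	}
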
> The mlock_count technique suffers from an unreliability much like with
> page_mlock(): while someone else has the page off LRU, not much can
> be done.  As before, err on the safe side (behave as if mlock_count 0),
> and let try_to_unlock_one() move the page to unevictable if reclaim finds
> out later on - a few misplaced pages don't matter, what we want to avoid
> is imbalancing reclaim by flooding evictable lists with unevictable pages.
> 
> I am not a fan of "if (!isolate_lru_page(page)) putback_lru_page(page);":
> if we have taken lruvec lock to get the page off its present list, then
> we save everyone trouble (and however many extra atomic ops) by putting
> it on its destination list immediately.

Good point.
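
Spelling it out for the archive (my own sketch, not hunks from this
patch): the old pattern takes the page fully off the LRU only to feed
it back through the generic putback path, while the new one, with the
lruvec lock already held, moves it straight to its destination list:

	/* before: take it off the LRU, then feed it back through putback */
	if (!isolate_lru_page(page))
		putback_lru_page(page);

	/* after, in spirit: lruvec lock already held, move directly */
	if (PageLRU(page)) {
		del_page_from_lru_list(page, lruvec);
		ClearPageActive(page);
		SetPageUnevictable(page);
		add_page_to_lru_list(page, lruvec);
	}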

> Signed-off-by: Hugh Dickins <hughd@...gle.com>

Acked-by: Vlastimil Babka <vbabka@...e.cz>

> ---
>  include/linux/mm_inline.h | 11 +++++--
>  include/linux/mm_types.h  | 19 +++++++++--
>  mm/huge_memory.c          |  5 ++-
>  mm/memcontrol.c           |  3 +-
>  mm/mlock.c                | 68 +++++++++++++++++++++++++++++++--------
>  mm/mmzone.c               |  7 ++++
>  mm/swap.c                 |  1 +
>  7 files changed, 92 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index b725839dfe71..884d6f6af05b 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -99,7 +99,8 @@ void lruvec_add_folio(struct lruvec *lruvec, struct folio *folio)
>  
>  	update_lru_size(lruvec, lru, folio_zonenum(folio),
>  			folio_nr_pages(folio));
> -	list_add(&folio->lru, &lruvec->lists[lru]);
> +	if (lru != LRU_UNEVICTABLE)
> +		list_add(&folio->lru, &lruvec->lists[lru]);
>  }
>  
>  static __always_inline void add_page_to_lru_list(struct page *page,
> @@ -115,6 +116,7 @@ void lruvec_add_folio_tail(struct lruvec *lruvec, struct folio *folio)
>  
>  	update_lru_size(lruvec, lru, folio_zonenum(folio),
>  			folio_nr_pages(folio));
> +	/* This is not expected to be used on LRU_UNEVICTABLE */

Felt uneasy about this at first: it's just the _tail version of
lruvec_add_folio, and there's probably nothing fundamental preventing
users of the _tail variant from encountering unevictable pages. But if
that assumption is ever violated, the poisoned list head should make it
immediately obvious, so I guess that's fine.
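
(For reference, the poisoning comes from the mm/mmzone.c part of the
patch, not quoted here -- if I'm reading it right, lruvec_init() now
does something like:

	/* there is no unevictable list; poison its head to catch misuse */
	list_del(&lruvec->lists[LRU_UNEVICTABLE]);

so a stray list_add_tail() onto it would trip over LIST_POISON rather
than silently corrupt anything.)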

>  	list_add_tail(&folio->lru, &lruvec->lists[lru]);
>  }
>  
> @@ -127,8 +129,11 @@ static __always_inline void add_page_to_lru_list_tail(struct page *page,
>  static __always_inline
>  void lruvec_del_folio(struct lruvec *lruvec, struct folio *folio)
>  {
> -	list_del(&folio->lru);
> -	update_lru_size(lruvec, folio_lru_list(folio), folio_zonenum(folio),
> +	enum lru_list lru = folio_lru_list(folio);
> +
> +	if (lru != LRU_UNEVICTABLE)
> +		list_del(&folio->lru);
> +	update_lru_size(lruvec, lru, folio_zonenum(folio),
>  			-folio_nr_pages(folio));
>  }
>  
