Message-ID: <de8c3dc8-7d36-419b-84f5-f886c6b724e6@redhat.com>
Date: Thu, 19 Dec 2024 18:23:06 +0100
From: David Hildenbrand <david@...hat.com>
To: Rik van Riel <riel@...riel.com>, Andrew Morton <akpm@...ux-foundation.org>
Cc: Chris Li <chrisl@...nel.org>, Ryan Roberts <ryan.roberts@....com>,
 "Matthew Wilcox (Oracle)" <willy@...radead.org>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, kernel-team@...a.com
Subject: Re: [PATCH] mm: add maybe_lru_add_drain() that only drains when
 threshold is exceeded

On 19.12.24 15:11, Rik van Riel wrote:
> On Thu, 2024-12-19 at 14:47 +0100, David Hildenbrand wrote:
>>
>>> +++ b/mm/swap_state.c
>>> @@ -317,7 +317,7 @@ void free_pages_and_swap_cache(struct
>>> encoded_page **pages, int nr)
>>>    	struct folio_batch folios;
>>>    	unsigned int refs[PAGEVEC_SIZE];
>>>    
>>> -	lru_add_drain();
>>> +	maybe_lru_add_drain();
>>
>> I'm wondering about the reason+effect of this existing call.
>>
>> Seems to date back to the beginning of git.
>>
>> Likely it doesn't make sense to have effectively-free pages in the
>> LRU+mlock cache. But then, this only considers the local CPU
>> LRU/mlock
>> caches ... hmmm
>>
>> So .... do we need this at all? :)
>>
> That is a very good question.
> 
> I think we need to free those pending pages at
> some point. They can't accumulate there forever.
> However, I am not sure where those points should
> be.

The number of entries is limited, so it will regularly be drained under 
normal system operation. Only if we manage to not place any further pages 
on the local LRU cache would they in fact get stranded there until 
someone does a lru_add_drain_all(), or we perform another operation that 
calls lru_add_drain() on this CPU.

I would assume memory reclaim would drain as well; I see some calls 
in vmscan.c.


Interestingly, we only use the LRU cache for small folios. Even order-1 
folios never strand there. So when freeing a bunch of large folios,
draining might not make sense at all.

> 
> I can think of a few considerations:
> 1) We should consider approximate LRU ordering,
>     and move pages onto the LRU every once in a
>     while.
> 2) When we are trying to free memory, we should
>     maybe ensure not too many pages are in these
>     temporary buffers?

folio_batch_add() drains if folio_batch_space() returns 0 (no slots left).

Apparently we have PAGEVEC_SIZE slots, which is 31 ... 31 * PAGE_SIZE 
stranded there.

> 3) For lock batching reasons, we do not want to
>     drain these buffers too frequently.

Yes. I know that we drain in places before we perform some action that 
requires these folios to be marked as LRU, for example, if we want to 
isolate them.

> 
> My patch takes a small step in the direction of
> more batching, but maybe we can take a larger one?

Yes, this "drain on the page freeing path" looks a bit odd to me, but 
maybe there is a good reason why it's been around for decades ... :)

-- 
Cheers,

David / dhildenb

