linux-kernel - Re: [PATCH] mm/gup: don't check page lru flag before draining it


Message-ID: <48fb0e58-16d1-7956-cf35-74741826617a@126.com>
Date: Sat, 8 Jun 2024 12:38:49 +0800
From: yangge1116 <yangge1116@....com>
To: David Hildenbrand <david@...hat.com>,
 Baolin Wang <baolin.wang@...ux.alibaba.com>, akpm@...ux-foundation.org,
 Matthew Wilcox <willy@...radead.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, liuzixing@...on.cn
Subject: Re: [PATCH] mm/gup: don't check page lru flag before draining it



On 2024/6/6 3:56 PM, David Hildenbrand wrote:
>>> Some random thoughts about some folio_test_lru() users:
>>>
>>> mm/khugepaged.c: skips pages if !folio_test_lru(), but would fail or
>>> skip it either way if there is the unexpected reference from the LRU batch!
>>>
>>> mm/compaction.c: skips pages if !folio_test_lru(), but would fail or
>>> skip it either way if there is the unexpected reference from the LRU batch!
>>>
>>> mm/memory.c: would love to identify this case and do a lru_add_drain()
>>> to free up that reference.
>>>
>>> mm/huge_memory.c: splitting with the additional reference will fail
>>> already. Maybe we'd want to drain the LRU batch.
>>
>> Agree.
>>
>>>
>>> mm/madvise.c: skips pages if !folio_test_lru(). I wonder what happens if
>>> we have the same page twice in an LRU batch with different target 
>>> goals ...
>>
>> IIUC, the LRU batch can ignore this folio since its LRU flag is cleared
>> by folio_isolate_lru(), and will then call folios_put() to drop the
>> reference.
>>
> 
> I think what's interesting to highlight in the current design is that a 
> folio might end up in multiple LRU batches, and whatever the result will 
> be is determined by the sequence of them getting flushed. Doesn't sound 
> quite right but maybe there was a reason for it (which could just have 
> been "simpler implementation").
> 
>>
>>> Some other users (there are not that many that don't use it for sanity
>>> checks though) might likely be a bit different.
> 
> There are also some PageLRU checks, but not that many.
> 
>>
>> mm/page_isolation.c: fails to set the pageblock migratetype to isolate
>> if !folio_test_lru(), so alloc_contig_range_noprof() can fail. But the
>> original code could set the pageblock migratetype to isolate and then
>> call drain_all_pages() in alloc_contig_range_noprof() to drop the
>> reference held by the LRU batch.
>>
>> mm/vmscan.c: will call lru_add_drain() before calling
>> isolate_lru_folios(), so there seems to be no impact.
> 
> lru_add_drain() will only drain the local CPU. So if the folio would be 
> stuck on another CPU's LRU batch, right now we could isolate it. When 
> processing that LRU batch while the folio is still isolated, it would 
> currently simply skip the operation.
> 
> So right now we can call isolate_lru_folios() even if the folio is stuck 
> on another CPU's LRU batch.
> 
> We cannot really reclaim the folio as long as it is in another CPU's LRU 
> batch, though (unexpected reference).
> 
>>
>> BTW, we also need to look at the usage of folio_isolate_lru().
> 
> Yes.
> 
>>
>> It doesn’t seem to have major obstacles, but there are many details to
>> analyze :)
> 
> Yes, we're only scratching the surface.
> 
> Having a way to identify "this folio is very likely in some CPU's LRU 
> batch" could end up being quite valuable, because likely we don't want 
> to blindly drain the LRU simply because there is some unexpected 
> reference on a folio [as we would in this patch].
> 

Can we add a PG_lru_batch flag to indicate whether a page is in an LRU 
batch? If we can, it seems this problem would become easier.
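
To illustrate the idea, here is a minimal user-space sketch, not kernel code. Every name in it (toy_folio, toy_batch, the lru_batch field standing in for a PG_lru_batch bit, and the toy_* helpers) is hypothetical and exists only for this model. It assumes a folio sits in at most one batch at a time; a single bit cannot describe the "same folio in multiple LRU batches" case raised above, so that case is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS    2
#define TOY_BATCH_SIZE 15

/* Toy stand-in for struct folio; none of this is kernel API. */
struct toy_folio {
	int refcount;    /* models the folio reference count */
	bool lru;        /* models PG_lru */
	bool lru_batch;  /* hypothetical PG_lru_batch */
};

/* Toy stand-in for one CPU's LRU batch. */
struct toy_batch {
	struct toy_folio *folios[TOY_BATCH_SIZE];
	size_t nr;
};

struct toy_batch toy_cpu_batch[TOY_NR_CPUS];

/* Queue a folio on one CPU's batch: the batch takes a reference and
 * the hypothetical flag is set. Returns false if the batch is full. */
bool toy_batch_add(int cpu, struct toy_folio *f)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	struct toy_batch *b = &toy_cpu_batch[cpu];

	if (b->nr == TOY_BATCH_SIZE)
		return false;
	f->refcount++;
	f->lru_batch = true;
	b->folios[b->nr++] = f;
	return true;
}

/* Flush one CPU's batch: put each folio on the LRU, clear the flag
 * and drop the reference the batch was holding. */
void toy_drain_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	struct toy_batch *b = &toy_cpu_batch[cpu];

	for (size_t i = 0; i < b->nr; i++) {
		struct toy_folio *f = b->folios[i];

		f->lru = true;
		f->lru_batch = false;
		f->refcount--;
	}
	b->nr = 0;
}

/* Drain all CPUs only when the flag says the folio is batched,
 * instead of draining blindly on any unexpected reference.
 * Returns true if a drain was actually performed. */
bool toy_drain_if_batched(struct toy_folio *f)
{
	if (!f->lru_batch)
		return false;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		toy_drain_cpu(cpu);
	return true;
}
```

In this model a caller that sees an unexpected reference would check the flag first and only pay for a drain when the folio is known to be batched; whether a real page flag is available for this, and how it would be kept consistent, is exactly the open question.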