linux-kernel - Re: [PATCH] mm, oom: Add lru_add_drain() in __oom_reap_task

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1d866f1b-94b3-43ec-8f4c-2de31b82d3d1@oracle.com>
Date: Wed, 10 Jan 2024 11:02:03 -0800
From: Jianfeng Wang <jianfeng.w.wang@...cle.com>
To: Michal Hocko <mhocko@...e.com>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm, oom: Add lru_add_drain() in __oom_reap_task_mm()

On 1/10/24 12:46 AM, Michal Hocko wrote:
> On Tue 09-01-24 01:15:11, Jianfeng Wang wrote:
>> The oom_reaper tries to reclaim additional memory owned by the oom
>> victim. In __oom_reap_task_mm(), it uses mmu_gather for batched page
>> free. After oom_reaper was added, mmu_gather feature introduced
>> CONFIG_MMU_GATHER_NO_GATHER (in 'commit 952a31c9e6fa ("asm-generic/tlb:
>> Introduce CONFIG_HAVE_MMU_GATHER_NO_GATHER=y")', an option to skip batched
>> page free. If set, tlb_batch_pages_flush(), which is responsible for
>> calling lru_add_drain(), is skipped during tlb_finish_mmu(). Without it,
>> pages could still be held by per-cpu fbatches rather than be freed.
>>
>> This fix adds lru_add_drain() prior to mmu_gather. This makes the code
>> consistent with other cases where mmu_gather is used for freeing pages.
> 
> Does this fix any actual problem or is this pure code consistency thing?
> I am asking because it doesn't make much sense to me TBH, LRU cache
> draining is usually important when we want to ensure that cached pages
> are put to LRU to be dealt with because otherwise the MM code wouldn't
> be able to deal with them. OOM reaper doesn't necessarily run on the
> same CPU as the oom victim so draining on a local CPU doesn't
> necessarily do anything for the victim's pages.
> 
> While this patch is not harmful I really do not see much point in adding
> the local draining here. Could you clarify please?
> 
It targets the case described in the patch's commit message: oom_killer
thinks that it 'reclaims' pages while pages are still held by per-cpu
fbatches with a ref count.

I admit that pages may sit on a different core(s). Given that
doing remote calls to all CPUs with lru_add_drain_all() is expensive,
this line of code can be helpful if it happens to give back a few pages
to the system right away without the overhead, especially when oom is
involved. Plus, it also makes the code consistent with other places
using mmu_gather feature to free pages in batch.

--JW

>> Signed-off-by: Jianfeng Wang <jianfeng.w.wang@...cle.com>
>> ---
>>  mm/oom_kill.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 9e6071fde34a..e2fcf4f062ea 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -537,6 +537,7 @@ static bool __oom_reap_task_mm(struct mm_struct *mm)
>>  			struct mmu_notifier_range range;
>>  			struct mmu_gather tlb;
>>  
>> +			lru_add_drain();
>>  			mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
>>  						mm, vma->vm_start,
>>  						vma->vm_end);
>> -- 
>> 2.42.1
>>
>