[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d2c24550-1274-4611-a26e-24b204da8778@vivo.com>
Date: Tue, 9 Jul 2024 12:23:54 +0800
From: zhiguojiang <justinjiang@...o.com>
To: Barry Song <baohua@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>, opensource.kernel@...o.com,
David Hildenbrand <david@...hat.com>, Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
在 2024/7/9 5:34, Barry Song 写道:
> On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@...o.com> wrote:
>>
>>
>> 在 2024/7/8 20:41, Barry Song 写道:
>>>
>>> zhiguojiang <justinjiang@...o.com> 于 2024年7月9日周二 00:25写道:
>>>
>>>
>>>
>>> 在 2024/7/8 20:17, zhiguojiang 写道:
>>> >
>>> >
>>> > 在 2024/7/8 19:02, Barry Song 写道:
>>> >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@...o.com>
>>> >> wrote:
>>> >>> The releasing process of the non-shared anonymous folio mapped
>>> >>> solely by
>>> >>> an exiting process may go through two flows: 1) the anonymous
>>> folio is
>>> >>> firstly is swaped-out into swapspace and transformed into a
>>> swp_entry
>>> >>> in shrink_folio_list; 2) then the swp_entry is released in the
>>> process
>>> >>> exiting flow. This will increase the cpu load of releasing a
>>> non-shared
>>> >>> anonymous folio mapped solely by an exiting process, because
>>> the folio
>>> >>> go through swap-out and the releasing the swapspace and swp_entry.
>>> >>>
>>> >>> When system is low memory, it is more likely to occur, because
>>> more
>>> >>> backend applidatuions will be killed.
>>> >>>
>>> >>> The modification is that shrink skips the non-shared anonymous
>>> folio
>>> >>> solely mapped by an exting process and the folio is only released
>>> >>> directly in the process exiting flow, which will save swap-out
>>> time
>>> >>> and alleviate the load of the process exiting.
>>> >>>
>>> >>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>>> >>> ---
>>> >>>
>>> >>> Change log:
>>> >>> v4->v5:
>>> >>> 1.Modify to skip non-shared anonymous folio only.
>>> >>> 2.Update comments for pra->referenced = -1.
>>> >>> v3->v4:
>>> >>> 1.Modify that the unshared folios mapped only in exiting task
>>> are skip.
>>> >>> v2->v3:
>>> >>> Nothing.
>>> >>> v1->v2:
>>> >>> 1.The VM_EXITING added in v1 patch is removed, because it will
>>> fail
>>> >>> to compile in 32-bit system.
>>> >>>
>>> >>> mm/rmap.c | 13 +++++++++++++
>>> >>> mm/vmscan.c | 7 ++++++-
>>> >>> 2 files changed, 19 insertions(+), 1 deletion(-)
>>> >>>
>>> >>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> >>> index 26806b49a86f..5b5281d71dbb
>>> >>> --- a/mm/rmap.c
>>> >>> +++ b/mm/rmap.c
>>> >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct
>>> folio
>>> >>> *folio,
>>> >>> int referenced = 0;
>>> >>> unsigned long start = address, ptes = 0;
>>> >>>
>>> >>> + /*
>>> >>> + * Skip the non-shared anonymous folio mapped solely by
>>> >>> + * the single exiting process, and release it directly
>>> >>> + * in the process exiting.
>>> >>> + */
>>> >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>> >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>>> >>> + folio_test_anon(folio) &&
>>> >>> folio_test_swapbacked(folio) &&
>>> >>> + !folio_likely_mapped_shared(folio)) {
>>> >>> + pra->referenced = -1;
>>> >>> + return false;
>>> >>> + }
>>> >>> +
>>> >>> while (page_vma_mapped_walk(&pvmw)) {
>>> >>> address = pvmw.address;
>>> > Sure, I agree with your modification suggestions. This way,
>>> using PTL
>>> > indeed sure
>>> > that the folio is mapped by this process.
>>> > Thanks
>>> >> As David suggested, what about the below?
>>> >>
>>> >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
>>> >> *folio,
>>> >> continue;
>>> >> }
>>> >>
>>> >> + /*
>>> >> + * Skip the non-shared anonymous folio mapped
>>> solely by
>>> >> + * the single exiting process, and release it
>>> directly
>>> >> + * in the process exiting.
>>> >> + */
>>> >> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>> >> + test_bit(MMF_OOM_SKIP,
>>> >> &vma->vm_mm->flags)) &&
>>> >> + folio_test_anon(folio) &&
>>> >> folio_test_swapbacked(folio) &&
>>> >> + !folio_likely_mapped_shared(folio)) {
>>> >> + pra->referenced = -1;
>>> >> + page_vma_mapped_walk_done(&pvmw);
>>> >> + return false;
>>> >> + }
>>> >> +
>>> >> if (pvmw.pte) {
>>> >> if (lru_gen_enabled() &&
>>> >> pte_young(ptep_get(pvmw.pte))) {
>>> >>
>>> >>
>>> >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
>>> >> &vma->vm_mm->flags) is
>>> >> correct (I think it is wrong). For example, global_init can
>>> >> directly have it:
>>> >> if (is_global_init(p)) {
>>> >> can_oom_reap = false;
>>> >> set_bit(MMF_OOM_SKIP, &mm->flags);
>>> >> pr_info("oom killer %d (%s) has mm
>>> pinned by
>>> >> %d (%s)\n",
>>> >> task_pid_nr(victim),
>>> >> victim->comm,
>>> >> task_pid_nr(p), p->comm);
>>> >> continue;
>>> >> }
>>> >>
>>> >> And exit_mmap() automatically has MMF_OOM_SKIP.
>>> >>
>>> >> What is the purpose of this check? Is there a better way to
>>> determine
>>> >> if a process is an
>>> >> OOM target? What about check_stable_address_space() ?
>>> > 1.Sorry, I overlook the situation with if (is_global_init(p)),
>>> > MMF_OOM_SKIP is indeed not suitable.
>>> >
>>> > 2.check_stable_address_space() can indicate oom_reaper, but it
>>> seems
>>> > unable to identify the situation where the process exits normally.
>>> > What about task_is_dying()? static inline bool
>>> task_is_dying(void) {
>>> > return tsk_is_oom_victim(current) ||
>>> fatal_signal_pending(current) ||
>>> > (current->flags & PF_EXITING); } Thanks
>>> We can migrate task_is_dying() from mm/memcontrol.c to
>>> include/linux/oom.h
>>> > static inline bool task_is_dying(void)
>>> > {
>>> > return tsk_is_oom_victim(current) ||
>>> fatal_signal_pending(current) ||
>>> > (current->flags & PF_EXITING);
>>> > }
>>>
>>>
>>> no. current is kswapd.
>> Hi Barry,
>>
>> It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP.
>> check_stable_address_space() can indicate oom kill, and
>> !atomic_read(&vma->vm_mm->mm_users)
>> can indicate the normal process exiting.
>>
>> /*
>> * Skip the non-shared anonymous folio mapped solely by
>> * the single exiting process, and release it directly
>> * in the process exiting.
>> */
>> if ((!atomic_read(&vma->vm_mm->mm_users) ||
>> check_stable_address_space(vma->vm_mm)) &&
>> folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>> !folio_likely_mapped_shared(folio)) {
>> pra->referenced = -1;
>> page_vma_mapped_walk_done(&pvmw);
>> return false;
>> }
>>
> Yes, + David, Willy (when you send a new version, please CC people who have
> participated and describe how you have addressed comments from those
> people.)
>
> I also think we actually can remove "folio_test_anon(folio)".
>
> So It could be,
>
> @@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio,
> continue;
> }
>
> + /*
> + * Skip the non-shared swapbacked folio mapped solely by
> + * the exiting or OOM-reaped process. This avoids redundant
> + * swap-out followed by an immediate unmap.
> + */
> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> + check_stable_address_space(vma->vm_mm)) &&
> + folio_test_swapbacked(folio) &&
> + !folio_likely_mapped_shared(folio)) {
> + pra->referenced = -1;
> + page_vma_mapped_walk_done(&pvmw);
> + return false;
> + }
> +
> if (pvmw.pte) {
> if (lru_gen_enabled() &&
> pte_young(ptep_get(pvmw.pte))) {
Ok, update in patch v6:
https://lore.kernel.org/linux-mm/20240709042122.631-1-justinjiang@vivo.com/
Thanks
>
>> Thanks
>> Zhiguo
>>>
>>> >>
>>> >>
>>> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> >>> index 0761f91b407f..bae7a8bf6b3d
>>> >>> --- a/mm/vmscan.c
>>> >>> +++ b/mm/vmscan.c
>>> >>> @@ -863,7 +863,12 @@ static enum folio_references
>>> >>> folio_check_references(struct folio *folio,
>>> >>> if (vm_flags & VM_LOCKED)
>>> >>> return FOLIOREF_ACTIVATE;
>>> >>>
>>> >>> - /* rmap lock contention: rotate */
>>> >>> + /*
>>> >>> + * There are two cases to consider.
>>> >>> + * 1) Rmap lock contention: rotate.
>>> >>> + * 2) Skip the non-shared anonymous folio mapped solely by
>>> >>> + * the single exiting process.
>>> >>> + */
>>> >>> if (referenced_ptes == -1)
>>> >>> return FOLIOREF_KEEP;
>>> >>>
>>> >>> --
>>> >>> 2.39.0
>>> >>>
>>> >> Thanks
>>> >> Barry
>>> >
>>>
> Thanks
> Barry
Powered by blists - more mailing lists