Message-ID: <d2c24550-1274-4611-a26e-24b204da8778@vivo.com>
Date: Tue, 9 Jul 2024 12:23:54 +0800
From: zhiguojiang <justinjiang@...o.com>
To: Barry Song <baohua@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
 LKML <linux-kernel@...r.kernel.org>, opensource.kernel@...o.com,
 David Hildenbrand <david@...hat.com>, Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process



On 2024/7/9 5:34, Barry Song wrote:
> On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@...o.com> wrote:
>>
>>
>> On 2024/7/8 20:41, Barry Song wrote:
>>>
>>> On Tue, Jul 9, 2024 at 00:25, zhiguojiang <justinjiang@...o.com> wrote:
>>>
>>>
>>>
>>>      On 2024/7/8 20:17, zhiguojiang wrote:
>>>      >
>>>      >
>>>      > On 2024/7/8 19:02, Barry Song wrote:
>>>      >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@...o.com>
>>>      >> wrote:
>>>      >>> The release of a non-shared anonymous folio mapped solely by
>>>      >>> an exiting process may go through two flows: 1) the anonymous
>>>      >>> folio is first swapped out into swapspace and transformed into
>>>      >>> a swp_entry in shrink_folio_list; 2) the swp_entry is then
>>>      >>> released in the process-exiting flow. This increases the cpu
>>>      >>> load of releasing a non-shared anonymous folio mapped solely
>>>      >>> by an exiting process, because the folio goes through swap-out
>>>      >>> and then the swapspace and swp_entry must be released as well.
>>>      >>>
>>>      >>> This is more likely to occur when the system is low on memory,
>>>      >>> because more background applications will be killed.
>>>      >>>
>>>      >>> The modification is that shrink skips the non-shared anonymous
>>>      >>> folio mapped solely by an exiting process; the folio is then
>>>      >>> released directly in the process-exiting flow only, which saves
>>>      >>> swap-out time and alleviates the load of the exiting process.
>>>      >>>
>>>      >>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>>>      >>> ---
>>>      >>>
>>>      >>> Change log:
>>>      >>> v4->v5:
>>>      >>> 1. Modify to skip the non-shared anonymous folio only.
>>>      >>> 2. Update comments for pra->referenced = -1.
>>>      >>> v3->v4:
>>>      >>> 1. Modify so that unshared folios mapped only in the exiting
>>>      >>>    task are skipped.
>>>      >>> v2->v3:
>>>      >>> Nothing.
>>>      >>> v1->v2:
>>>      >>> 1. The VM_EXITING added in the v1 patch is removed, because it
>>>      >>>    fails to compile on 32-bit systems.
>>>      >>>
>>>      >>>   mm/rmap.c   | 13 +++++++++++++
>>>      >>>   mm/vmscan.c |  7 ++++++-
>>>      >>>   2 files changed, 19 insertions(+), 1 deletion(-)
>>>      >>>
>>>      >>> diff --git a/mm/rmap.c b/mm/rmap.c
>>>      >>> index 26806b49a86f..5b5281d71dbb
>>>      >>> --- a/mm/rmap.c
>>>      >>> +++ b/mm/rmap.c
>>>      >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
>>>      >>>          int referenced = 0;
>>>      >>>          unsigned long start = address, ptes = 0;
>>>      >>>
>>>      >>> +       /*
>>>      >>> +        * Skip the non-shared anonymous folio mapped solely by
>>>      >>> +        * the single exiting process, and release it directly
>>>      >>> +        * in the process exiting.
>>>      >>> +        */
>>>      >>> +       if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>>      >>> +               test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>>>      >>> +               folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>>>      >>> +               !folio_likely_mapped_shared(folio)) {
>>>      >>> +               pra->referenced = -1;
>>>      >>> +               return false;
>>>      >>> +       }
>>>      >>> +
>>>      >>>          while (page_vma_mapped_walk(&pvmw)) {
>>>      >>>                  address = pvmw.address;
>>>      > Sure, I agree with your modification suggestions. This way, using
>>>      > the PTL indeed ensures that the folio is mapped by this process.
>>>      > Thanks
>>>      >> As David suggested, what about the below?
>>>      >>
>>>      >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio *folio,
>>>      >>                          continue;
>>>      >>                  }
>>>      >>
>>>      >> +               /*
>>>      >> +                * Skip the non-shared anonymous folio mapped solely by
>>>      >> +                * the single exiting process, and release it directly
>>>      >> +                * in the process exiting.
>>>      >> +                */
>>>      >> +               if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>>      >> +                    test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>>>      >> +                   folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>>>      >> +                   !folio_likely_mapped_shared(folio)) {
>>>      >> +                       pra->referenced = -1;
>>>      >> +                       page_vma_mapped_walk_done(&pvmw);
>>>      >> +                       return false;
>>>      >> +               }
>>>      >> +
>>>      >>                  if (pvmw.pte) {
>>>      >>                          if (lru_gen_enabled() &&
>>>      >>                              pte_young(ptep_get(pvmw.pte))) {
>>>      >>
>>>      >>
>>>      >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
>>>      >> &vma->vm_mm->flags) is correct (I think it is wrong). For example,
>>>      >> global_init can directly have it:
>>>      >>
>>>      >>                  if (is_global_init(p)) {
>>>      >>                          can_oom_reap = false;
>>>      >>                          set_bit(MMF_OOM_SKIP, &mm->flags);
>>>      >>                          pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n",
>>>      >>                                  task_pid_nr(victim), victim->comm,
>>>      >>                                  task_pid_nr(p), p->comm);
>>>      >>                          continue;
>>>      >>                  }
>>>      >>
>>>      >> And exit_mmap() automatically sets MMF_OOM_SKIP.
>>>      >>
>>>      >> What is the purpose of this check? Is there a better way to
>>>      >> determine if a process is an OOM target? What about
>>>      >> check_stable_address_space()?
>>>      > 1. Sorry, I overlooked the situation with if (is_global_init(p));
>>>      > MMF_OOM_SKIP is indeed not suitable.
>>>      >
>>>      > 2. check_stable_address_space() can indicate the oom_reaper, but it
>>>      > seems unable to identify the situation where the process exits
>>>      > normally. What about task_is_dying()?
>>>      >
>>>      > static inline bool task_is_dying(void)
>>>      > {
>>>      >         return tsk_is_oom_victim(current) ||
>>>      >                fatal_signal_pending(current) ||
>>>      >                (current->flags & PF_EXITING);
>>>      > }
>>>      >
>>>      > Thanks
>>>      We can migrate task_is_dying() from mm/memcontrol.c to
>>>      include/linux/oom.h:
>>>      > static inline bool task_is_dying(void)
>>>      > {
>>>      >         return tsk_is_oom_victim(current) ||
>>>      >                fatal_signal_pending(current) ||
>>>      >                (current->flags & PF_EXITING);
>>>      > }
>>>
>>>
>>> no. current is kswapd.
>> Hi Barry,
>>
>> It seems feasible for check_stable_address_space() to replace MMF_OOM_SKIP:
>> check_stable_address_space() can indicate an oom kill, and
>> !atomic_read(&vma->vm_mm->mm_users) can indicate the normal process exiting.
>>
>>           /*
>>            * Skip the non-shared anonymous folio mapped solely by
>>            * the single exiting process, and release it directly
>>            * in the process exiting.
>>            */
>>           if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>               check_stable_address_space(vma->vm_mm)) &&
>>               folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>>               !folio_likely_mapped_shared(folio)) {
>>               pra->referenced = -1;
>>               page_vma_mapped_walk_done(&pvmw);
>>               return false;
>>           }
>>
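
For reference (paraphrasing include/linux/oom.h from memory, not quoting it
verbatim), check_stable_address_space() only reports an mm that the oom
reaper has started to tear down, which is why it covers the oom-kill case
while !atomic_read(&vma->vm_mm->mm_users) covers the normal exit:

static inline vm_fault_t check_stable_address_space(struct mm_struct *mm)
{
        /* MMF_UNSTABLE is set once the oom reaper starts unmapping the mm */
        if (unlikely(test_bit(MMF_UNSTABLE, &mm->flags)))
                return VM_FAULT_SIGBUS;
        return 0;
}
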
> Yes, + David, Willy (when you send a new version, please CC people who have
> participated and describe how you have addressed comments from those
> people.)
>
> I also think we actually can remove "folio_test_anon(folio)".
>
> So it could be:
>
> @@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio,
>                          continue;
>                  }
>
> +               /*
> +                * Skip the non-shared swapbacked folio mapped solely by
> +                * the exiting or OOM-reaped process. This avoids redundant
> +                * swap-out followed by an immediate unmap.
> +                */
> +               if ((!atomic_read(&vma->vm_mm->mm_users) ||
> +                    check_stable_address_space(vma->vm_mm)) &&
> +                    folio_test_swapbacked(folio) &&
> +                    !folio_likely_mapped_shared(folio)) {
> +                       pra->referenced = -1;
> +                       page_vma_mapped_walk_done(&pvmw);
> +                       return false;
> +               }
> +
>                  if (pvmw.pte) {
>                          if (lru_gen_enabled() &&
>                              pte_young(ptep_get(pvmw.pte))) {
Ok, updated in patch v6:
https://lore.kernel.org/linux-mm/20240709042122.631-1-justinjiang@vivo.com/
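
To recap the intended flow (a rough sketch paraphrasing mm/rmap.c and
mm/vmscan.c, not verbatim kernel code):

        /*
         * folio_check_references(): the rmap walk ends early and
         * folio_referenced() returns -1 for a folio mapped solely by an
         * exiting or OOM-reaped process.
         */
        referenced_ptes = folio_referenced(folio, 1, sc->target_mem_cgroup,
                                           &vm_flags);
        ...
        /*
         * -1 now also means "owned by a dying mm": keep the folio on the
         * LRU so exit_mmap()/zap_pte_range() frees it directly instead of
         * shrink_folio_list() swapping it out first.
         */
        if (referenced_ptes == -1)
                return FOLIOREF_KEEP;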

Thanks
>
>> Thanks
>> Zhiguo
>>>
>>>      >>
>>>      >>
>>>      >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>      >>> index 0761f91b407f..bae7a8bf6b3d
>>>      >>> --- a/mm/vmscan.c
>>>      >>> +++ b/mm/vmscan.c
>>>      >>> @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
>>>      >>>          if (vm_flags & VM_LOCKED)
>>>      >>>                  return FOLIOREF_ACTIVATE;
>>>      >>>
>>>      >>> -       /* rmap lock contention: rotate */
>>>      >>> +       /*
>>>      >>> +        * There are two cases to consider.
>>>      >>> +        * 1) Rmap lock contention: rotate.
>>>      >>> +        * 2) Skip the non-shared anonymous folio mapped solely by
>>>      >>> +        *    the single exiting process.
>>>      >>> +        */
>>>      >>>          if (referenced_ptes == -1)
>>>      >>>                  return FOLIOREF_KEEP;
>>>      >>>
>>>      >>> --
>>>      >>> 2.39.0
>>>      >>>
>>>      >> Thanks
>>>      >> Barry
>>>      >
>>>
> Thanks
> Barry

