Message-ID: <CAOUHufY0=EW65tD01mm6ha75XWjcc43aGVuSJ8AfPc+dDLH6ZA@mail.gmail.com>
Date: Fri, 7 Jul 2023 23:06:47 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: "Yin, Fengwei" <fengwei.yin@...el.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
ryan.roberts@....com, shy828301@...il.com,
akpm@...ux-foundation.org, willy@...radead.org, david@...hat.com
Subject: Re: [RFC PATCH 0/3] support large folio for mlock
On Fri, Jul 7, 2023 at 11:01 PM Yin, Fengwei <fengwei.yin@...el.com> wrote:
>
>
>
> On 7/8/2023 12:45 PM, Yu Zhao wrote:
> > On Fri, Jul 7, 2023 at 10:52 AM Yin Fengwei <fengwei.yin@...el.com> wrote:
> >>
> >> Yu mentioned at [1] that mlock() can't be applied to large folios.
> >>
> >> I studied the related code and here is my understanding:
> >> - For RLIMIT_MEMLOCK, there is no problem, because the RLIMIT_MEMLOCK
> >>   accounting is done per VMA range, not per underlying page. Whether
> >>   the underlying pages are mlocked or munlocked doesn't affect the
> >>   RLIMIT_MEMLOCK accounting, which is therefore always correct.
> >>
> >> - For keeping the pages in RAM, there is no problem either. At least
> >>   during try_to_unmap_one(), once it detects that the VMA has the
> >>   VM_LOCKED bit set in vm_flags, the folio is kept whether the folio
> >>   itself is mlocked or not.
> >>
> >> So mlock for large folios is functionally correct. But it's not
> >> optimal, because page reclaim still needs to scan these large folios
> >> and may split them.
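
(Aside, for illustration: the check in question in try_to_unmap_one()
looks roughly like the sketch below; the exact code in mm/rmap.c and
the mlock_vma_folio() arguments vary across kernel versions.)

	if (!(flags & TTU_IGNORE_MLOCK) &&
	    (vma->vm_flags & VM_LOCKED)) {
		/* Restore the mlock which reclaim would otherwise
		 * miss; note it is skipped for large folios today. */
		if (!folio_test_large(folio))
			mlock_vma_folio(folio, vma, false);
		/* Keep the folio resident: abort the unmap walk. */
		page_vma_mapped_walk_done(&pvmw);
		ret = false;
		break;
	}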
> >>
> >> This series classifies large folios for mlock into two types:
> >> - The large folio lies entirely within a VM_LOCKED VMA range
> >> - The large folio crosses a VM_LOCKED VMA boundary
> >>
> >> For the first type, we mlock the large folio so page reclaim will
> >> skip it. For the second type, we don't mlock the large folio; it's
> >> allowed to be picked by page reclaim and split, so the pages not in
> >> the VM_LOCKED VMA range can be reclaimed/released.
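
(Aside, for illustration: the distinction boils down to a range check
like the minimal sketch below. The helper name is hypothetical, and it
assumes 'addr' is the user address where the folio's first page is
mapped; the series' actual API may differ.)

	/* Hypothetical helper: true iff the whole folio lies inside
	 * the VMA, given the address of its first mapped page. */
	static inline bool folio_within_vma_range(struct folio *folio,
			struct vm_area_struct *vma, unsigned long addr)
	{
		return addr >= vma->vm_start &&
		       addr + folio_size(folio) <= vma->vm_end;
	}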
> >
> > This is a sound design, which is also what I have in mind. I see the
> > rationales are being spelled out in this thread, and hopefully
> > everyone can be convinced.
> >
> >> patch1 introduces an API to check whether a large folio is within a
> >> VMA range.
> >> patch2 makes page reclaim/mlock_vma_folio/munlock_vma_folio support
> >> large folio mlock/munlock.
> >> patch3 makes the mlock/munlock syscalls support large folios.
> >
> > Could you tidy up the last patch a little bit? E.g., saying "mlock the
> > 4K folio" is obviously not the best idea.
> >
> > And if it's possible, make the loop just look like before, i.e.,
> >
> > 	if (!can_mlock_entire_folio())
> > 		continue;
> > 	if (vma->vm_flags & VM_LOCKED)
> > 		mlock_folio_range();
> > 	else
> > 		munlock_folio_range();
> This can leave the large folio mlocked even after user space calls
> munlock() on the range. Consider the following case:
> 1. mlock() a 64K range; the underlying 64K large folio is mlocked.
> 2. mprotect() the first 32K of the range to a different prot, which
>    triggers a VMA split.
> 3. munlock() the 64K range. Since the 64K large folio no longer fits
>    entirely within either of the two new VMAs, it will not be
>    munlocked, and it can only be reclaimed after it's unmapped from
>    both VMAs instead of after the range is munlocked.
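
(Aside, for illustration: the sequence above in hypothetical user-space
terms, assuming 'addr' is page-aligned and backed by one naturally
aligned 64K large folio.)

	#include <sys/mman.h>

	/* 1. mlock() 64K: the backing 64K large folio is mlocked. */
	mlock(addr, 0x10000);

	/* 2. mprotect() the first 32K: the VMA splits, so the folio
	 *    now crosses a VMA boundary. */
	mprotect(addr, 0x8000, PROT_READ);

	/* 3. munlock() 64K: the folio fits entirely inside neither
	 *    new VMA, so with the factoring above it would stay
	 *    mlocked until it is unmapped from both VMAs. */
	munlock(addr, 0x10000);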
I understand. I'm asking to factor the code, not to change the logic.