linux-kernel - Re: [RFC PATCH 0/3] support large folio for mlock


Message-ID: <ca1df2b0-3a36-2762-f20e-b4a235087c9d@intel.com>
Date:   Sat, 8 Jul 2023 13:01:06 +0800
From:   "Yin, Fengwei" <fengwei.yin@...el.com>
To:     Yu Zhao <yuzhao@...gle.com>
CC:     <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
        <ryan.roberts@....com>, <shy828301@...il.com>,
        <akpm@...ux-foundation.org>, <willy@...radead.org>,
        <david@...hat.com>
Subject: Re: [RFC PATCH 0/3] support large folio for mlock



On 7/8/2023 12:45 PM, Yu Zhao wrote:
> On Fri, Jul 7, 2023 at 10:52 AM Yin Fengwei <fengwei.yin@...el.com> wrote:
>>
>> Yu mentioned at [1] that mlock() can't be applied to large folios.
>>
>> I studied the related code, and here is my understanding:
>> - For RLIMIT_MEMLOCK accounting, there is no problem, because the
>>   RLIMIT_MEMLOCK statistics are not tied to the underlying pages. That
>>   means mlocking or munlocking the underlying pages doesn't affect the
>>   RLIMIT_MEMLOCK statistics collection, which is always correct.
>>
>> - For keeping the page in RAM, there is no problem either. At least
>>   during try_to_unmap_one(), once it detects that the VMA has the
>>   VM_LOCKED bit set in vm_flags, the folio is kept whether or not
>>   the folio itself is mlocked.
>>
>> So mlock works functionally for large folios. But it's not optimized,
>> because page reclaim needs to scan these large folios and may split
>> them.
>>
>> This series classifies large folios for mlock into two types:
>>   - The large folio is entirely within a VM_LOCKED VMA range
>>   - The large folio crosses a VM_LOCKED VMA boundary
>>
>> For the first type, we mlock the large folio so page reclaim will
>> skip it. For the second type, we don't mlock the large folio. It can
>> be picked by page reclaim and be split, so the pages outside the
>> VM_LOCKED VMA range can be reclaimed/released.
> 
> This is a sound design, which is also what I have in mind. I see the
> rationales are being spelled out in this thread, and hopefully
> everyone can be convinced.
> 
>> patch1 introduces an API to check whether a large folio is within a
>> VMA range.
>> patch2 makes page reclaim/mlock_vma_folio/munlock_vma_folio support
>> large folio mlock/munlock.
>> patch3 makes the mlock/munlock syscalls support large folios.
> 
> Could you tidy up the last patch a little bit? E.g., Saying "mlock the
> 4K folio" is obviously not the best idea.
> 
> And if it's possible, make the loop just look like before, i.e.,
> 
>   if (!can_mlock_entire_folio())
>     continue;
>   if (vma->vm_flags & VM_LOCKED)
>     mlock_folio_range();
>   else
>     munlock_folio_range();
This can leave a large folio mlocked even after user space calls
munlock() on the range. Consider the following case:
  1. mlock() a 64K range; the underlying 64K large folio is mlocked.
  2. mprotect() the first 32K of the range to a different prot, which
     triggers a VMA split.
  3. munlock() the 64K range. As the 64K large folio is not fully
     within either of the two new VMAs, it will not be munlocked, and
     it can only be reclaimed after it's unmapped from both VMAs
     instead of after the range is munlocked.
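
The three steps above can be replayed in a small user-space model. This
is only a sketch to make the failure mode concrete, not kernel code:
the types model_vma and model_folio and the helpers
folio_fully_in_vma(), apply_vma_lock_state() and
folio_stays_mlocked_after_split() are hypothetical names invented for
illustration, and the "fully inside the VMA" test is an assumption
about what a check like can_mlock_entire_folio() would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the kernel's types. */
struct range { unsigned long start, end; };          /* [start, end) in bytes */
struct model_vma { struct range r; bool locked; };   /* locked models VM_LOCKED */
struct model_folio { struct range r; bool mlocked; };

/* True when the folio's mapped range lies entirely inside the VMA. */
static bool folio_fully_in_vma(const struct model_folio *f,
			       const struct model_vma *v)
{
	assert(f->r.start < f->r.end && v->r.start < v->r.end);
	return f->r.start >= v->r.start && f->r.end <= v->r.end;
}

/*
 * Loop body in the shape suggested in the quoted review: skip any folio
 * that is not fully inside the VMA, otherwise follow the VMA's lock state.
 */
static void apply_vma_lock_state(struct model_folio *f,
				 const struct model_vma *v)
{
	if (!folio_fully_in_vma(f, v))
		return;			/* the "continue" in the suggested loop */
	f->mlocked = v->locked;
}

/* Replays steps 1-3; returns whether the folio is still mlocked at the end. */
bool folio_stays_mlocked_after_split(void)
{
	const unsigned long K = 1024;
	struct model_folio folio = { { 0, 64 * K }, false };

	/* Step 1: mlock() the 64K range; one VMA covers the whole folio. */
	struct model_vma whole = { { 0, 64 * K }, true };
	apply_vma_lock_state(&folio, &whole);

	/* Step 2: mprotect() on the first 32K splits the VMA in two. */
	struct model_vma parts[2] = {
		{ { 0, 32 * K }, true },
		{ { 32 * K, 64 * K }, true },
	};

	/* Step 3: munlock() clears the lock flag and visits each new VMA. */
	for (size_t i = 0; i < 2; i++) {
		parts[i].locked = false;
		apply_vma_lock_state(&folio, &parts[i]);
	}

	return folio.mlocked;
}
```

In this model the folio is skipped for both halves in step 3, so its
mlocked state is never cleared, which matches the concern described
above. Without the split in step 2, the same munlock pass over the
single 64K VMA would clear it.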


Regards
Yin, Fengwei

> 
> Thanks.