Message-ID: <afd3a7bf-cd45-9c59-a853-d49d82ee87da@intel.com>
Date:   Sat, 8 Jul 2023 12:36:22 +0800
From:   "Yin, Fengwei" <fengwei.yin@...el.com>
To:     Matthew Wilcox <willy@...radead.org>
CC:     David Hildenbrand <david@...hat.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>, <yuzhao@...gle.com>,
        <ryan.roberts@....com>, <shy828301@...il.com>,
        <akpm@...ux-foundation.org>
Subject: Re: [RFC PATCH 0/3] support large folio for mlock



On 7/8/2023 12:02 PM, Matthew Wilcox wrote:
> On Sat, Jul 08, 2023 at 11:52:23AM +0800, Yin, Fengwei wrote:
>>> Oh, I agree, there are always going to be circumstances where we realise
>>> we've made a bad decision and can't (easily) undo it.  Unless we have a
>>> per-page pincount, and I Would Rather Not Do That.  But we should _try_
>>> to do that because it's the right model -- that's what I meant by "Tell
>>> me why I'm wrong"; what scenarios do we have where a user temporarily
>>> mlocks (or mprotects or ...) a range of memory, but wants that memory
>>> to be aged in the LRU exactly the same way as the adjacent memory that
>>> wasn't mprotected?
>> From the manpage of mlock():
>>        mlock(),  mlock2(), and mlockall() lock part or all of the calling process's virtual address space into RAM, preventing that memory
>>        from being paged to the swap area.
>>
>> So my understanding is that it's OK to let the mlocked memory be aged
>> together with the adjacent memory that is not mlocked, as long as it is
>> never paged out to swap.
> 
> Right, it doesn't break anything; it's just a similar problem to
> internal fragmentation.  The pages of the folio which aren't mlocked
> will also be locked in RAM and never paged out.
This patchset doesn't mlock a large folio that crosses a VMA boundary, so
page reclaim can still pick such a folio and split it. The pages outside
the VM_LOCKED VMA range will then be paged out. Or did I miss something here?
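
As a concrete illustration of the case being discussed (just a userspace
sketch, not code from the patchset, and assuming a THP-backed anonymous
mapping): map a PMD-sized region, request huge pages, then mlock only part
of it, so the backing large folio is only partially covered by VM_LOCKED:

    #define _GNU_SOURCE
    #include <sys/mman.h>

    int main(void)
    {
            size_t len = 2UL << 20;         /* one PMD-sized (2MB) region */
            char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED)
                    return 1;
            madvise(p, len, MADV_HUGEPAGE); /* hint: back with a large folio */
            p[0] = 1;                       /* fault the range in */

            /* Lock only the first half: VM_LOCKED covers part of the folio. */
            if (mlock(p, len / 2))
                    return 1;
            return 0;
    }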

> 
>> One question about an implementation detail:
>>   If a large folio crossing the VMA boundary cannot be split, how do we
>>   deal with this case? Retry in the syscall until the split succeeds?
>>   Or return an error (and which errno should we choose) to user space?
> 
> I would be tempted to allocate memory & copy to the new mlocked VMA.
> The old folio will go on the deferred_list and be split later, or its
> valid parts will be written to swap and then it can be freed.

OK. This could be the common handling for any case where splitting a VMA
requires splitting a large folio but the folio split fails.
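
To make sure I follow, a rough sketch of the fallback you describe, with
hypothetical helpers (alloc_and_copy_for_vma(), remap_vma_to_folio()) only
to make the control flow concrete; this is not the actual implementation:

    /*
     * Hypothetical fallback (not the actual patchset code) for when a
     * VMA split leaves a large folio straddling the new VM_LOCKED
     * boundary and the folio split fails.
     */
    struct folio *new_folio;

    if (split_folio(folio)) {
            /* Allocate fresh memory for the mlocked VMA and copy into it. */
            new_folio = alloc_and_copy_for_vma(locked_vma, folio); /* hypothetical helper */
            remap_vma_to_folio(locked_vma, new_folio);             /* hypothetical helper */

            /*
             * The old folio goes on the deferred_list to be split later,
             * or reclaim writes out its still-valid parts and frees it.
             */
            folio_put(folio);
    }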


Regards
Yin, Fengwei
