[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <06df816f-b8b7-f6c0-3710-baad99fb3213@linux.alibaba.com>
Date: Mon, 2 Jul 2018 17:01:12 -0700
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: mhocko@...nel.org, willy@...radead.org, ldufour@...ux.vnet.ibm.com,
peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
alexander.shishkin@...ux.intel.com, jolsa@...hat.com,
namhyung@...nel.org, tglx@...utronix.de, hpa@...or.com,
linux-mm@...ck.org, x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC v3 PATCH 4/5] mm: mmap: zap pages with read mmap_sem for
large mapping
On 6/29/18 9:26 PM, Yang Shi wrote:
>
>
> On 6/29/18 8:15 PM, Andrew Morton wrote:
>> On Fri, 29 Jun 2018 19:28:15 -0700 Yang Shi
>> <yang.shi@...ux.alibaba.com> wrote:
>>
>>>
>>>> we're adding a bunch of code to 32-bit kernels which will never be
>>>> executed.
>>>>
>>>> I'm thinking it would be better to be much more explicit with "#ifdef
>>>> CONFIG_64BIT" in this code, rather than relying upon the above magic.
>>>>
>>>> But I tend to think that the fact that we haven't solved anything on
>>>> locked vmas or on uprobed mappings is a shostopper for the whole
>>>> approach :(
>>> I agree it is not that perfect. But, it still could improve the most
>>> use
>>> cases.
>> Well, those unaddressed usecases will need to be fixed at some point.
>
> Yes, definitely.
>
>> What's our plan for that?
>
> As I mentioned in the earlier email, locked and hugetlb cases might be
> able to be solved by separating vm_flags update and actual unmap. I
> will look into it further later.
By looking into this further, I think both mlocked and hugetlb vmas can
be handled.
For mlocked vmas, it is easy since we acquires write mmap_sem before
unmapping, so VM_LOCK flags can be cleared here then unmap, just like
what the regular path does.
For hugetlb vmas, the VM_MAYSHARE flag is just checked by
huge_pmd_share() in hugetlb_fault()->huge_pte_alloc(), another call site
is dup_mm()->copy_page_range()->copy_hugetlb_page_range(), we don't care
this call chain in this case.
So we may expand VM_DEAD to hugetlb_fault(). Michal suggested to check
VM_DEAD in check_stable_address_space(), so it would be called in
hugetlb_fault() too (not in current code), then the page fault handler
would bail out before huge_pte_alloc() is called.
With this trick, we don't have to care about when the vm_flags is
updated, we can unmap hugetlb vmas in read mmap_sem critical section,
then update the vm_flags with write mmap_sem held or before the unmap.
Yang
>
> From my point of view, uprobe mapping sounds not that vital.
>
>>
>> Would one of your earlier designs have addressed all usecases? I
>> expect the dumb unmap-a-little-bit-at-a-time approach would have?
>
> Yes. The v1 design does unmap with holding write map_sem. So, the
> vm_flags update is not a problem.
>
> Thanks,
> Yang
>
>>
>>> For the locked vmas and hugetlb vmas, unmapping operations need modify
>>> vm_flags. But, I'm wondering we might be able to separate unmap and
>>> vm_flags update. Because we know they will be unmapped right away, the
>>> vm_flags might be able to be updated in write mmap_sem critical section
>>> before the actual unmap is called or after it. This is just off the top
>>> of my head.
>>>
>>> For uprobed mappings, I'm not sure how vital it is to this case.
>>>
>>> Thanks,
>>> Yang
>>>
>
Powered by blists - more mailing lists