Message-ID: <49c17035-1b8c-5fa3-9944-33467589d1f1@linux.alibaba.com>
Date: Thu, 12 Apr 2018 09:20:24 -0700
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: Cyrill Gorcunov <gorcunov@...il.com>, adobriyan@...il.com,
willy@...radead.org, mguzik@...hat.com, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [v3 PATCH] mm: introduce arg_lock to protect arg_start|end and
env_start|end in mm_struct
On 4/12/18 5:18 AM, Michal Hocko wrote:
> On Tue 10-04-18 11:28:13, Yang Shi wrote:
>>
>> On 4/10/18 9:21 AM, Yang Shi wrote:
>>>
>>> On 4/10/18 5:28 AM, Cyrill Gorcunov wrote:
>>>> On Tue, Apr 10, 2018 at 01:10:01PM +0200, Michal Hocko wrote:
>>>>>> Because do_brk does vma manipulations, for this reason it's
>>>>>> running under down_write_killable(&mm->mmap_sem). Or you
>>>>>> mean something else?
>>>>> Yes, all we need the new lock for is to get a consistent view on brk
>>>>> values. I am simply asking whether there is something fundamentally
>>>>> wrong by doing the update inside the new lock while keeping the
>>>>> original
>>>>> mmap_sem locking in the brk path. That would allow us to drop the
>>>>> mmap_sem lock in the proc path when looking at brk values.
>>>> Michal gimme some time. I guess we might do so, but I need some
>>>> spare time to take a more precise look into the code, hopefully this
>>>> evening. Also I have a suspicion that we've wrecked check_data_rlimit
>>>> with this new lock in prctl. Need to verify it again.
>>> I see your points. We might be able to move the drop of mmap_sem
>>> before setting mm->brk in sys_brk, since mmap_sem should be used to
>>> protect vma manipulation only, then protect the value modification with
>>> the new arg_lock. Then we can eliminate the mmap_sem stuff in the prctl
>>> path, and it also prevents wrecking check_data_rlimit.
>>>
>>> At first glance, it looks feasible to me. Will look into it more deeply
>>> later.
>> A further look told me this might *not* be feasible.
>>
>> It looks like the new lock will not break check_data_rlimit, since in my
>> patch both start_brk and brk are protected by mmap_sem. The code flow
>> might look like below:
>>
>> CPU A                    CPU B
>> --------                 --------
>> prctl                    sys_brk
>>                          down_write
>> check_data_rlimit        check_data_rlimit (need mm->start_brk)
>>                          set brk
>> down_write               up_write
>> set start_brk
>> set brk
>> up_write
>>
>>
>> If CPU A gets the mmap_sem first, it will set start_brk and brk, then CPU B
>> will check with the new start_brk. And prctl doesn't care if sys_brk runs
>> before it, since it gets the new start_brk and brk from its parameters.
>>
>> If we protect start_brk and brk with the new lock, sys_brk might get the
>> old start_brk, then sys_brk might break the rlimit check silently, is that
>> right?
>>
>> So it looks like using the new lock in prctl while keeping mmap_sem in the
>> brk path has a race condition.
> OK, I admittedly didn't give it too much time to think about. Maybe
> we can do something clever to remove the race, but can we start at least by
> reducing the write lock to a read lock on the prctl side and using the
> dedicated spinlock for updating values? That should close the above race
> AFAICS, and the read lock would be much more friendly to other VM
> operations.
Yes, it sounds feasible. We just need to care about the case where prctl
runs before sys_brk. So, you mean:

    down_read
    spin_lock
    update all the values
    spin_unlock
    up_read