Message-ID: <82c9c89c-aee2-08a3-e562-359631bb0137@bytedance.com>
Date: Tue, 15 Nov 2022 15:39:02 +0800
From: Zhongkun He <hezhongkun.hzk@...edance.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, corbet@....net,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [External] Re: [PATCH v2] mm: add new syscall
pidfd_set_mempolicy().
>>> We shouldn't really rely on mmap_sem for this IMO.
>>
>> Yes, We should rely on mmap_sem for vma->vm_policy,but not for
>> process context policy(task->mempolicy).
>
> But the caller has no way to know which kind of policy is returned so
> the locking cannot be conditional on the policy type.
Yes. vma->vm_policy is protected by mmap_sem, which would be reliable if
we wanted to add a new API (pidfd_mbind()) to change the vma->vm_policy
of the process specified by a pidfd. It is not sufficient for
pidfd_set_mempolicy(), though, because task->mempolicy is protected by
alloc_lock.
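
To make the split concrete, here is a minimal user-space sketch, not
kernel code. Every name in it is a hypothetical stand-in: two pthread
mutexes play the roles of mmap_sem and alloc_lock, and effective_mode()
mimics a lookup that falls back from the VMA policy to the task policy.
Because the helper takes the matching lock itself, the caller never has
to know which kind of policy it got.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects discussed above. */
struct policy { int mode; };

struct task {
    pthread_mutex_t mm_lock;     /* plays the role of mmap_sem */
    pthread_mutex_t alloc_lock;  /* plays the role of task->alloc_lock */
    struct policy *mempolicy;    /* guarded by alloc_lock */
};

struct vma { struct policy *vm_policy; };  /* guarded by mm_lock */

/*
 * Copy the effective policy mode into *mode. The VMA policy is read
 * under mm_lock; if there is none, the task policy is read under
 * alloc_lock. Returns 1 if a policy was found, 0 otherwise.
 */
static int effective_mode(struct task *t, struct vma *v, int *mode)
{
    int found = 0;

    assert(t != NULL && mode != NULL);

    pthread_mutex_lock(&t->mm_lock);
    if (v && v->vm_policy) {
        *mode = v->vm_policy->mode;
        found = 1;
    }
    pthread_mutex_unlock(&t->mm_lock);
    if (found)
        return 1;

    pthread_mutex_lock(&t->alloc_lock);
    if (t->mempolicy) {
        *mode = t->mempolicy->mode;
        found = 1;
    }
    pthread_mutex_unlock(&t->alloc_lock);
    return found;
}
```

Copying the mode out under the lock sidesteps the lifetime question
entirely, which is only possible in a toy like this; returning a pointer
would need one of the lifetime schemes discussed in this thread.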
>
> Yes this is all understood but the level of the overhead is not really
> clear. So the question is whether this will induce a visible overhead.
OK, I will try it.
> Because from the maintainability point of view it is much less costly to
> have a clear life time model. Right now we have a mix of reference
> counting and per-task requirements which is rather subtle and easy to
> get wrong. In an ideal world we would have get_vma_policy always
> returning a reference counted policy or NULL. If we really need to
> optimize for cache line bouncing we can go with per cpu reference
> counters (something that was not available at the time the mempolicy
> code has been introduced).
>
> So I am not saying that the task_work based solution is not possible I
> just think that this looks like a good opportunity to get from the
> existing subtle model.
OK, I got it. Thanks for your reply and suggestions.
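
As a rough illustration of the model suggested above, where a lookup
always returns a reference-counted policy or NULL, here is a small
user-space sketch using C11 atomics. It is not the kernel's struct
mempolicy, and policy_new(), policy_get(), policy_put() and
lookup_policy() are hypothetical names chosen for this example. A real
implementation would also have to take the reference while a lock or RCU
keeps the object alive; that part is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical user-space model of a reference-counted policy. */
struct policy {
    atomic_int refcnt;
    int mode;
};

static struct policy *policy_new(int mode)
{
    struct policy *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    atomic_init(&p->refcnt, 1);  /* the creator holds the first reference */
    p->mode = mode;
    return p;
}

/* Take an extra reference; NULL is passed through unchanged. */
static struct policy *policy_get(struct policy *p)
{
    if (p)
        atomic_fetch_add(&p->refcnt, 1);
    return p;
}

/* Drop a reference. Returns 1 if this call freed the object. */
static int policy_put(struct policy *p)
{
    int old;

    if (!p)
        return 0;
    old = atomic_fetch_sub(&p->refcnt, 1);
    assert(old > 0);
    if (old == 1) {
        free(p);
        return 1;
    }
    return 0;
}

/*
 * Prefer the VMA policy and fall back to the task policy. The result is
 * always a counted reference or NULL, so every caller follows one rule:
 * call policy_put() on whatever was returned.
 */
static struct policy *lookup_policy(struct policy *vma_pol,
                                    struct policy *task_pol)
{
    return policy_get(vma_pol ? vma_pol : task_pol);
}
```

With this shape the caller does not need to know whether the policy came
from the VMA or from the task. The cost is one atomic increment and
decrement per lookup, which is the overhead that still needs to be
measured.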
Zhongkun.