linux-kernel - Re: [External] Re: [PATCH v2] mm: add new syscall pidfd_set


Message-ID: <82c9c89c-aee2-08a3-e562-359631bb0137@bytedance.com>
Date:   Tue, 15 Nov 2022 15:39:02 +0800
From:   Zhongkun He <hezhongkun.hzk@...edance.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, corbet@....net,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        linux-api@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [External] Re: [PATCH v2] mm: add new syscall
 pidfd_set_mempolicy().

>>> We shouldn't really rely on mmap_sem for this IMO.
>>
>>   Yes, we should rely on mmap_sem for vma->vm_policy, but not for
>>   the process context policy (task->mempolicy).
> 
> But the caller has no way to know which kind of policy is returned so
> the locking cannot be conditional on the policy type.

Yes. vma->vm_policy is protected by mmap_sem, which would be reliable if
we wanted to add a new API (pidfd_mbind()) to change the vma->vm_policy of
the process specified by the pidfd. It does not apply to
pidfd_set_mempolicy(), because task->mempolicy is protected by alloc_lock.

> 
> Yes this is all understood but the level of the overhead is not really
> clear. So the question is whether this will induce a visible overhead.
OK, I will try it.

> Because from the maintainability point of view it is much less costly to
> have a clear life time model. Right now we have a mix of reference
> counting and per-task requirements which is rather subtle and easy to
> get wrong. In an ideal world we would have get_vma_policy always
> returning a reference counted policy or NULL. If we really need to
> optimize for cache line bouncing we can go with per cpu reference
> counters (something that was not available at the time the mempolicy
> code has been introduced).
> 
> So I am not saying that the task_work based solution is not possible, I
> just think that this looks like a good opportunity to get away from the
> existing subtle model.

OK, I got it. Thanks for your reply and suggestions.


Zhongkun.
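
For illustration only, the lifetime model suggested in the quoted text (a
lookup that always returns a reference-counted policy or NULL) could be
sketched as a small userspace C model. This is not kernel code: struct
policy, struct owner, policy_lookup(), policy_put() and policy_replace() are
hypothetical names invented for this sketch, and the protection a real
implementation would need against the owner's pointer being replaced while
it is being read (a lock or RCU) is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace stand-in for a memory policy; NOT the kernel's struct mempolicy. */
struct policy {
    atomic_int refcnt;  /* number of live references */
    int mode;           /* placeholder for the policy mode */
};

/* Hypothetical owner (think: a task or a VMA) holding one reference. */
struct owner {
    struct policy *pol; /* NULL means no policy installed */
};

static int policies_freed; /* bookkeeping for this example only */

/* Allocate a policy; the caller receives the initial reference. */
static struct policy *policy_new(int mode)
{
    struct policy *p = malloc(sizeof(*p));
    if (!p)
        return NULL;
    atomic_init(&p->refcnt, 1);
    p->mode = mode;
    return p;
}

/* Drop one reference and free the object when the last one goes away. */
static void policy_put(struct policy *p)
{
    if (!p)
        return;
    int old = atomic_fetch_sub(&p->refcnt, 1);
    assert(old > 0);
    if (old == 1) {
        free(p);
        policies_freed++;
    }
}

/*
 * Always return a counted reference or NULL, so every caller follows the
 * same rule: pair each successful lookup with policy_put().
 * Assumption: concurrent replacement of o->pol is excluded by the caller.
 */
static struct policy *policy_lookup(struct owner *o)
{
    struct policy *p = o->pol;
    if (p)
        atomic_fetch_add(&p->refcnt, 1);
    return p;
}

/* Install a new policy and drop the owner's reference to the old one. */
static void policy_replace(struct owner *o, struct policy *newp)
{
    struct policy *old = o->pol;
    o->pol = newp;
    policy_put(old);
}
```

With this shape, a reader that obtained a policy through policy_lookup()
keeps a valid object even if another context calls policy_replace() in the
meantime; the old policy is freed only when that reader calls policy_put().
The cost is one atomic increment and decrement per lookup, which is the
overhead the thread asks to have measured.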