Message-ID: <Y3uNWbPmwHtytKzY@dhcp22.suse.cz>
Date: Mon, 21 Nov 2022 15:38:17 +0100
From: Michal Hocko <mhocko@...e.com>
To: Zhongkun He <hezhongkun.hzk@...edance.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, corbet@....net,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [External] Re: [PATCH v2] mm: add new syscall
pidfd_set_mempolicy().
On Thu 17-11-22 15:19:20, Zhongkun He wrote:
> Hi Michal, thanks for your reply.
>
> >
> > It would be better to add the patch that has been tested.
>
> OK.
>
> >
> > One way to deal with that would be to use a similar model as css_tryget
>
> Percpu_ref is a good way to reduce the memory footprint in the fast
> path, but it has the potential to make mempolicy heavy. The size of
> mempolicy is 32 bytes and it may not have a long lifetime, since it is
> duplicated from the parent in fork(). If we change atomic_t to
> percpu_ref, reading in the fast path becomes more efficient, but
> creation and deletion become slower and the occupied space grows
> significantly. I am not really sure it is worth it.
>
> atomic_t: 4 bytes
> sizeof(percpu_ref) + sizeof(percpu_ref_data) + cpus * sizeof(unsigned long):
> 16 + 56 + cpus * 8 bytes
Yes, the memory consumption is going to increase, but the question is
whether that is a real problem in practice. Is it really common to
have many vmas with a dedicated policy?
What I am arguing here is that there are essentially two ways forward.
Either we continue to build on top of the existing and arguably very
fragile code, making it even more subtle, or we follow the general
pattern of proper reference counting (with the usual tricks to reduce
cache line bouncing and similar issues). I do not really see why memory
policies should be any different and require very special treatment.
> > Btw. have you tried to profile those slowdowns to identify hotspots?
> >
> > Thanks
>
> Yes, it degrades performance by about 2%-3%, likely because of the
> task_lock and the atomic operations on the reference count, as shown
> in the previous email.
>
> New hotspots in perf:
> 1.34% [kernel] [k] __mpol_put
> 0.53% [kernel] [k] _raw_spin_lock
> 0.44% [kernel] [k] get_task_policy
Thanks!
--
Michal Hocko
SUSE Labs