linux-kernel - Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

Message-ID: <20160415150815.GM32377@dhcp22.suse.cz>
Date:	Fri, 15 Apr 2016 17:08:15 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Tejun Heo <tj@...nel.org>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Petr Mladek <pmladek@...e.com>, cgroups@...r.kernel.org,
	Cyril Hrubis <chrubis@...e.cz>, linux-kernel@...r.kernel.org
Subject: Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

On Fri 15-04-16 10:38:15, Tejun Heo wrote:
> Hello, Michal.
> 
> On Fri, Apr 15, 2016 at 09:06:01AM +0200, Michal Hocko wrote:
> > Tejun was proposing to do the migration async (move the whole
> > mem_cgroup_move_charge into the work item). This would solve the problem
> > of course. I haven't checked whether this would be safe but it at least
> > sounds doable (albeit far from trivial). It would also be a user visible
> > change because the new memcg will not contain the moved charges after we
> > return to user space. I think this would be acceptable but if somebody
> 
> Not necessarily.  The only thing necessary is flushing the work item
> after releasing locks but before returning to user.
> cpuset_post_attach_flush() does exactly the same thing.

Ahh, ok, I didn't know that __cgroup_procs_write is doing something
controller specific. Yes, then we wouldn't need a generic callback if
other code like the above would be acceptable.
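
To illustrate the ordering being discussed (queue the work while holding
the locks, flush it only after the locks are dropped but before returning
to the caller), here is a minimal user-space sketch. It is not kernel code:
big_lock, executor, move_charges and procs_write are hypothetical stand-ins
for the global lock, a workqueue, the charge-moving work item and the write
handler, and the assumption that the work item itself needs the lock is
made only to show why waiting under the lock would hang.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: a plain lock for the global lock and a
# single-thread pool for a workqueue.
big_lock = threading.Lock()
executor = ThreadPoolExecutor(max_workers=1)
charges = {"old": 3, "new": 0}


def move_charges():
    # Assumption for this sketch: the deferred work cannot make progress
    # until big_lock is free, so it must never be waited for while the
    # submitter still holds big_lock.
    with big_lock:
        charges["new"] += charges["old"]
        charges["old"] = 0


def procs_write():
    with big_lock:
        # Only queue the work here; calling pending.result() inside this
        # block would wait forever on work that is itself blocked on the lock.
        pending = executor.submit(move_charges)
    # Lock released: flush the work item before "returning to user space",
    # so the caller still observes the moved charges.
    pending.result()
    return dict(charges)
```

Because the flush happens before procs_write() returns, the caller sees the
same result as with a synchronous move, which is the point made above about
the change not having to be user visible.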

> > really relies on the previous behavior I guess we can solve it with a
> > post_move cgroup callback which would be called from a lockless context.
> > 
> > Anyway, before we go that way, can we at least consider the possibility
> > of removing the kworker creation dependency on the global rwsem? AFAIU
> > this locking was added because of the pid controller. Do we even care
> > about something as volatile as kworkers in the pid controller?
> 
> It's not just pid controller and the global percpu locking has lower

Where else would the locking matter? I have only checked the git history
to build my picture, so I might be missing something of course.

> hotpath overhead.  We can try to exclude kworkers out of the locking
> but that can get really nasty and there are already attempts to add
> cgroup support to workqueue.  Will think more about it.  For now tho,
> do you think making charge moving async would be difficult?

Well, it certainly is not that trivial because it relies on being
exclusive with the global context. I will have to look closer of course,
but I cannot guarantee I will get to it before I get back from LSF. We
can certainly discuss that at the conference. Johannes will be there as
well.

Thanks!
-- 
Michal Hocko
SUSE Labs