linux-kernel - Re: [PATCH -v2 4/6] memcg: make sure that memcg is not offline when charging

Message-ID: <20140205133834.GB2425@dhcp22.suse.cz>
Date:	Wed, 5 Feb 2014 14:38:34 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: [PATCH -v2 4/6] memcg: make sure that memcg is not offline when
 charging

On Tue 04-02-14 11:29:39, Johannes Weiner wrote:
[...]
> Maybe we should remove the XXX if it makes you think we should change
> the current situation by any means necessary.  This patch is not an
> improvement.
>
> I put the XXX there so that we one day maybe refactor the code in a
> clean fashion where try_get_mem_cgroup_from_whatever() is in the same
> rcu section as the first charge attempt.  On failure, reclaim, and do
> the lookup again.

I wouldn't be opposed to such a cleanup. It is not that simple, though.
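
The refactoring quoted above (lookup and first charge attempt in one RCU
section; on failure, reclaim and repeat the lookup) could be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions: struct group, struct task, read_lock(), try_charge(),
reclaim() and charge_task() are hypothetical stand-ins, not kernel
functions, and the read section is simulated with a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct group {
    bool online;   /* cleared once the group has been taken offline */
    long usage;    /* pages currently charged */
    long limit;    /* maximum number of pages */
    int  pins;     /* references that keep the group alive */
};

struct task { struct group *grp; };

/* Stand-ins for rcu_read_lock()/rcu_read_unlock(): just a depth counter. */
static int read_depth;
static void read_lock(void)   { read_depth++; }
static void read_unlock(void) { assert(read_depth > 0); read_depth--; }

/* Charge attempt; only valid inside the read section. */
static bool try_charge(struct group *g, long pages)
{
    assert(read_depth > 0);
    if (!g->online || g->usage + pages > g->limit)
        return false;
    g->usage += pages;
    return true;
}

/* Toy reclaim: drop up to 'pages' pages of usage. */
static void reclaim(struct group *g, long pages)
{
    long freed = g->usage < pages ? g->usage : pages;
    g->usage -= freed;
}

/* Returns the group that was charged, or NULL after max_retries retries. */
static struct group *charge_task(struct task *t, long pages, int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        read_lock();
        struct group *g = t->grp;            /* lookup ... */
        bool charged = try_charge(g, pages); /* ... and charge, same section */
        if (!charged)
            g->pins++;                       /* pin before leaving the section */
        read_unlock();

        if (charged)
            return g;

        reclaim(g, pages);                   /* outside the section, pinned */
        g->pins--;
        /* Loop: the task may have moved, so the lookup is repeated. */
    }
    return NULL;
}
```

In this model a task whose group sits at its limit succeeds on the second
pass, after reclaim has made room, while a task that still points at an
offline group gets NULL once the retries are exhausted.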

> Also, this problem only exists on swapin, where the memcg is looked up
> from an auxiliary data structure and not the current task, so maybe
> that would be an angle to look for a clean solution.

I am not so sure about that. The task could have been moved to a
different group basically anytime it was outside of an rcu_read_lock
section (which means most of the time). And so the group might get
removed and we are in the very same situation.

> Either way, the problem is currently fixed 

OK, my understanding (and my ack was based on that) was that we needed
a simple and safe fix for the stable trees and we would have something
more appropriate later on. Preventing the race sounds like a more
appropriate and technically better solution to me. So I would rather
ask why we should keep a workaround in place. Does the fix add any risk?
Especially when we basically abuse the 2 stage cgroup removal. All the
charges should be cleared out after css_offline.
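
To illustrate the invariant behind that last point, here is a minimal
userspace sketch of a two-stage removal. It assumes, for illustration
only, that going offline hands the remaining charges to the parent; the
names struct memgroup, group_offline(), group_can_free(),
charge_unchecked() and charge_checked() are hypothetical, not kernel
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct memgroup {
    bool online;
    long usage;                 /* pages charged to this group */
    struct memgroup *parent;
};

/* Stage one: go offline and hand the remaining charges to the parent. */
static void group_offline(struct memgroup *g)
{
    assert(g->online);
    g->online = false;
    if (g->parent)
        g->parent->usage += g->usage;
    g->usage = 0;
}

/* Stage two is only safe if nothing is charged to the offline group. */
static bool group_can_free(const struct memgroup *g)
{
    return !g->online && g->usage == 0;
}

/* A charge that does not look at the online state (models the race). */
static void charge_unchecked(struct memgroup *g, long pages)
{
    g->usage += pages;
}

/* A charge that refuses offline groups; the caller has to look up again. */
static bool charge_checked(struct memgroup *g, long pages)
{
    if (!g->online)
        return false;
    g->usage += pages;
    return true;
}
```

A charge that slips in after group_offline() through charge_unchecked()
leaves usage behind on a group that nobody will drain again, so
group_can_free() turns false; charge_checked() keeps the invariant by
failing instead.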

> with a *oneliner*.

That is really not important because _that_ oneliner abuses a function
which should in fact be called from a different context.

> Unless the alternative solution is inherent in a clean rework of the
> code to match cgroup core lifetime management, I don't see any reason
> to move away from the status quo.

To be honest, this sounds like weak reasoning for refusing a real fix
which replaces a workaround.

This is a second attempt to fix the actual race, and you are dismissing
it, which is really surprising to me. Especially when the workaround is
an ugly hack.
-- 
Michal Hocko
SUSE Labs