linux-kernel - Re: [RFC PATCH] mm: memcg: fix css double put in mem_cgroup

Message-ID: <20170727065314.GC20970@dhcp22.suse.cz>
Date:   Thu, 27 Jul 2017 08:53:15 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Wenwei Tao <wenwei.tww@...il.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Balbir Singh <bsingharora@...il.com>,
        kamezawa.hiroyu@...fujitsu.com, yuwang.yuwang@...baba-inc.com,
        cgroups@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Wenwei Tao <wenwei.tww@...baba-inc.com>
Subject: Re: [RFC PATCH] mm: memcg: fix css double put in mem_cgroup_iter

On Thu 27-07-17 11:30:50, Wenwei Tao wrote:
> 2017-07-26 21:44 GMT+08:00 Michal Hocko <mhocko@...nel.org>:
> > On Wed 26-07-17 21:07:42, Wenwei Tao wrote:
[...]
> >> I think there is a css double put in mem_cgroup_iter. Under reclaim,
> >> we first call mem_cgroup_iter with prev == NULL, take the last_visited
> >> memcg from the per-zone reclaim_iter, and call __mem_cgroup_iter_next
> >> to get the next live memcg. __mem_cgroup_iter_next can return NULL if
> >> last_visited is already the last one, so we put last_visited's css and
> >> go around the while loop again. This time we might skip
> >> css_tryget(&last_visited->css) because dead_count has changed, but we
> >> still call css_put(&last_visited->css). The css is thus put twice,
> >> which can trigger the BUG_ON at kernel/cgroup.c:893.
> >
> > Yes, I guess you are right and I suspect that this has been silently
> > fixed by 519ebea3bf6d ("mm: memcontrol: factor out reclaim iterator
> > loading and updating"). I think a more appropriate fix would be.
> > Are you able to reproduce and re-test it?
> > ---
> 
> Yes, I think this commit fixes the issue. I backported it to the
> 3.10.107 kernel and can no longer reproduce the problem. I guess
> this commit might need to be backported to the 3.10.y stable kernel.

Please send it to the kernel-stable mailing list. 3.10 still seems to be
maintained.

-- 
Michal Hocko
SUSE Labs
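
For readers following the quoted report, the unbalanced put can be modelled
in a few lines of userspace C. This is only a sketch under stated
assumptions, not the kernel code: toy_css, toy_tryget, toy_put and
toy_iter_net_refs are hypothetical names, the "next" lookup is assumed to
return NULL on both passes, and the dead_count check is assumed to match on
the first pass and mismatch on the second. The reset_each_pass switch merely
shows one way to keep gets and puts balanced in this model; it is not a
description of what 519ebea3bf6d does.

```c
#include <assert.h>   /* for the checks mentioned in the note below */
#include <stddef.h>

/* Hypothetical stand-in for a css reference count; not the kernel type. */
struct toy_css {
	int refcnt;
};

/* Take a reference unless the count has already dropped to zero. */
static int toy_tryget(struct toy_css *css)
{
	if (css->refcnt <= 0)
		return 0;
	css->refcnt++;
	return 1;
}

/* Drop a reference. */
static void toy_put(struct toy_css *css)
{
	css->refcnt--;
}

/*
 * Run two passes of a simplified iterator loop and return the net change
 * in the reference count (0 means gets and puts were balanced).
 *
 * reset_each_pass == 0: last_visited keeps its value across passes.
 * reset_each_pass == 1: last_visited is cleared at the top of each pass.
 */
static int toy_iter_net_refs(int reset_each_pass)
{
	struct toy_css css = { .refcnt = 1 };
	struct toy_css *cached = &css;        /* models the per-zone cache */
	struct toy_css *last_visited = NULL;  /* lives outside the loop */
	/* Assumption: dead_count matches on pass 0, has changed on pass 1. */
	const int dead_count_matches[2] = { 1, 0 };
	int pass;

	for (pass = 0; pass < 2; pass++) {
		if (reset_each_pass)
			last_visited = NULL;

		if (dead_count_matches[pass]) {
			last_visited = cached;
			if (last_visited && !toy_tryget(last_visited))
				last_visited = NULL;
		}

		/* Assumption: the "next" lookup finds nothing on either pass. */

		if (last_visited)
			toy_put(last_visited);  /* stale pointer => extra put */

		cached = NULL;  /* cache now records "no next memcg" */
	}

	return css.refcnt - 1;
}
```

Calling toy_iter_net_refs(0) reports one put more than there were gets,
which is the imbalance the report describes; toy_iter_net_refs(1) stays
balanced.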