Message-ID: <20131216104042.GC23582@dhcp22.suse.cz>
Date:	Mon, 16 Dec 2013 11:40:42 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Li Zefan <lizefan@...wei.com>
Cc:	Hugh Dickins <hughd@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-mm@...ck.org, cgroups@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: 3.13-rc breaks MEMCG_SWAP

On Mon 16-12-13 10:53:45, Michal Hocko wrote:
> On Mon 16-12-13 17:36:09, Li Zefan wrote:
> > On 2013/12/16 16:36, Hugh Dickins wrote:
> > > CONFIG_MEMCG_SWAP is broken in 3.13-rc.  Try something like this:
> > > 
> > > mkdir -p /tmp/tmpfs /tmp/memcg
> > > mount -t tmpfs -o size=1G tmpfs /tmp/tmpfs
> > > mount -t cgroup -o memory memcg /tmp/memcg
> > > mkdir /tmp/memcg/old
> > > echo 512M >/tmp/memcg/old/memory.limit_in_bytes
> > > echo $$ >/tmp/memcg/old/tasks
> > > cp /dev/zero /tmp/tmpfs/zero 2>/dev/null
> > > echo $$ >/tmp/memcg/tasks
> > > rmdir /tmp/memcg/old
> > > sleep 1	# let rmdir's deferred work complete
> > > mkdir /tmp/memcg/new
> > > umount /tmp/tmpfs
> > > dmesg | grep WARNING
> > > rmdir /tmp/memcg/new
> > > umount /tmp/memcg
> > > 
> > > Shows lots of WARNING: CPU: 1 PID: 1006 at kernel/res_counter.c:91
> > >                            res_counter_uncharge_locked+0x1f/0x2f()
> > > 
> > > Breakage comes from 34c00c319ce7 ("memcg: convert to use cgroup id").
> > > 
> > > The lifetime of a cgroup id is different from the lifetime of the
> > > css id it replaced: memsw's css_get()s do nothing to hold on to the
> > > old cgroup id; it soon gets recycled to a new cgroup, which then
> > > mysteriously inherits the old group's swap, without any charge for it.
> > > (I thought memsw's particular need had been discussed and was
> > > well understood when 34c00c319ce7 went in, but apparently not.)
> > > 
> > > The right thing to do at this stage would be to revert that and its
> > > associated commits; but I imagine doing so would be unwelcome to
> > > the cgroup guys, going against their general direction; and I've
> > > no idea how embedded that css_id removal has become by now.
> > > 
> > > Perhaps some creative refcounting can rescue memsw while still
> > > using cgroup id?
> > > 
> > 
> > Sorry for the breakage.
> > 
> > I think we can keep the cgroup->id until the last css reference is
> > dropped and the css is scheduled to be destroyed.
> 
> How would this work? The task which pushed the memory to the swap is
> still alive (living in a different group) and the swap will be there
> after the last reference to css as well.

Or did you mean to take a css reference in swap_cgroup_record and release
it in __mem_cgroup_try_charge_swapin?
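
IOW something like this completely untested sketch (3.13-era function
names; where exactly the get/put would live is illustrative only, and
locking and error handling are left out):

	/* swap-out side: record the owner and pin its css so that the
	 * cgroup id cannot be recycled to a new group while the swap
	 * entry still refers to it */
	swap_cgroup_record(ent, mem_cgroup_id(memcg));
	css_get(&memcg->css);

	/* swap-in side (__mem_cgroup_try_charge_swapin): the entry is
	 * being consumed, so drop the pin taken at swap-out */
	id = lookup_swap_cgroup_id(ent);
	memcg = mem_cgroup_lookup(id);
	if (memcg)
		css_put(&memcg->css);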

That would prevent the warning (assuming idr_remove would move to
css_free[1]), but I am not sure it is the right thing to do. The memsw
charges will already be accounted to the parent (assuming there is one)
with nobody left to uncharge them, because all uncharges would fall back
to the root memcg after css_offline.
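
For reference, the swapin lookup side does something along these lines
(a rough paraphrase of the 3.13-era code, not a verbatim quote):

	rcu_read_lock();
	memcg = mem_cgroup_lookup(id);	/* idr lookup by cgroup id */
	if (memcg && !css_tryget(&memcg->css))
		memcg = NULL;		/* group is gone already */
	rcu_read_unlock();
	/* a NULL memcg here is what makes the caller fall back, so the
	 * charge/uncharge ends up in a group that never owned the swap */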

Hugh's approach seems much better.

---
[1] Is this even possible? I cannot say I understand the comment above
idr_remove in cgroup_destroy_css_killed 100%, but it suggests we cannot
postpone the removal until later.
-- 
Michal Hocko
SUSE Labs