linux-kernel - Re: [PATCH v2 1/3] mm: memcontrol: fix swap counter leak on swapout from offline cgroup


Message-ID: <20160803141203.GA12838@cmpxchg.org>
Date:	Wed, 3 Aug 2016 10:12:03 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Vladimir Davydov <vdavydov@...tuozzo.com>
Cc:	Michal Hocko <mhocko@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	stable@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] mm: memcontrol: fix swap counter leak on swapout
 from offline cgroup

On Wed, Aug 03, 2016 at 02:46:40PM +0300, Vladimir Davydov wrote:
> On Wed, Aug 03, 2016 at 01:09:42PM +0200, Michal Hocko wrote:
> > On Wed 03-08-16 12:50:49, Vladimir Davydov wrote:
> > > On Tue, Aug 02, 2016 at 06:00:26PM +0200, Michal Hocko wrote:
> > > > On Tue 02-08-16 18:00:48, Vladimir Davydov wrote:
> > > ...
> > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > > index 3be791afd372..4ae12effe347 100644
> > > > > --- a/mm/memcontrol.c
> > > > > +++ b/mm/memcontrol.c
> > > > > @@ -4036,6 +4036,24 @@ static void mem_cgroup_id_get(struct mem_cgroup *memcg)
> > > > >  	atomic_inc(&memcg->id.ref);
> > > > >  }
> > > > >  
> > > > > +static struct mem_cgroup *mem_cgroup_id_get_active(struct mem_cgroup *memcg)
> > > > > +{
> > > > > +	while (!atomic_inc_not_zero(&memcg->id.ref)) {
> > > > > +		/*
> > > > > +		 * The root cgroup cannot be destroyed, so it's refcount must
> > > > > +		 * always be >= 1.
> > > > > +		 */
> > > > > +		if (memcg == root_mem_cgroup) {
> > > > > +			VM_BUG_ON(1);
> > > > > +			break;
> > > > > +		}
> > > > 
> > > > why not simply VM_BUG_ON(memcg == root_mem_cgroup)?
> > > 
> > > Because with DEBUG_VM disabled we could wind up looping forever here if
> > > the refcount of the root_mem_cgroup got screwed up. On production
> > > kernels, it's better to break the loop and carry on closing eyes on
> > > diverging counters rather than getting a lockup.
> > 
> > Wouldn't this just paper over a real bug? Anyway I will not insist but
> > making the code more complex just to pretend we can handle a situation
> > gracefully doesn't sound right to me.
> 
> But we can handle this IMO. AFAICS diverging id refcount will typically
> result in leaking swap charges, which aren't even a real resource. At
> worst, we can leak an offline mem_cgroup, which is also not critical
> enough to crash the production system.

Agreed. If we have the option to detect and warn about the bug, but
can continue to limp along without causing data corruption, then
that's what we should do.

> I see your concern of papering over a bug though. What about adding a
> warning there?
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1c0aa59fd333..8c8e68becee9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4044,7 +4044,7 @@ static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
>  		 * The root cgroup cannot be destroyed, so it's refcount must
>  		 * always be >= 1.
>  		 */
> -		if (memcg == root_mem_cgroup) {
> +		if (WARN_ON_ONCE(memcg == root_mem_cgroup)) {
>  			VM_BUG_ON(1);
>  			break;
>  		}

The WARN_ON_ONCE() makes sense to me. But if we warn on all configs
anyway, the VM_BUG_ON() doesn't provide any additional value. Anybody
who is testing new code and enables DEBUG_VM should notice a warning
without requiring the kernel to blow up in their face; it also lets
them inspect other state that may no longer be available after a BUG().
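
To make the suggestion concrete, here is a minimal userspace sketch of the
warn-and-break pattern with the VM_BUG_ON() dropped. It is only a model, not
the kernel code: struct group, root_group, ref_inc_not_zero(), warn_on_once()
and warn_count are hypothetical stand-ins for struct mem_cgroup,
root_mem_cgroup, atomic_inc_not_zero() and WARN_ON_ONCE(). The walk to the
parent group is an assumption too, since that part of the loop is not visible
in the quoted hunk.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct mem_cgroup. */
struct group {
	atomic_int ref;		/* models memcg->id.ref */
	struct group *parent;	/* NULL for the root */
};

/* Stand-in for root_mem_cgroup; its refcount should never reach zero. */
static struct group root_group = { .ref = 1, .parent = NULL };

/* Counts emitted warnings so that callers can check for them. */
static int warn_count;

/* Take a reference unless the count is already zero. */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Rough model of WARN_ON_ONCE(): report once, always return cond. */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Return the nearest group, starting at @g, on which a reference could
 * be taken. If even the root refcount is zero, warn once and break out
 * of the loop instead of spinning forever or crashing.
 */
static struct group *group_id_get_online(struct group *g)
{
	while (!ref_inc_not_zero(&g->ref)) {
		if (warn_on_once(g == &root_group, "root refcount is zero"))
			break;
		/* Assumed fallback: retry on the parent, or on the root. */
		g = g->parent ? g->parent : &root_group;
	}
	return g;
}
```

In this model, a caller that hits the broken-root case gets the root group
back without a reference taken, so the counters diverge, but nothing loops
or crashes and the warning is printed only once.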