Date:	Wed, 23 Jul 2014 18:08:37 +0400
From:	Vladimir Davydov <vdavydov@...allels.com>
To:	Michal Hocko <mhocko@...e.cz>
CC:	<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
	<cgroups@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Hugh Dickins <hughd@...gle.com>,
	David Rientjes <rientjes@...gle.com>,
	Pavel Emelyanov <xemul@...allels.com>,
	Balbir Singh <bsingharora@...il.com>
Subject: Re: [PATCH RFC 0/5] Virtual Memory Resource Controller for cgroups

On Wed, Jul 16, 2014 at 02:01:47PM +0200, Michal Hocko wrote:
> On Fri 04-07-14 19:38:53, Vladimir Davydov wrote:
> > Considering the example I've given above, neither of these will help if
> > the system has other active CTs: the container will be forcibly kept
> > around its high/low limit and, since that is definitely not enough for it,
> > it will finally be killed, throwing away the computations it has spent so
> > much time on. A high limit won't be good for the container even if there's
> > no other load on the node - it will be constantly swapping out anon
> > memory and evicting file caches. The application won't die quickly then,
> > but it will suffer a heavy slowdown, which is no better than being killed,
> > I guess.
> 
> It will get vmpressure notifications, though, which can help it release
> excessive buffers that were allocated optimistically.

But the user will only get the notification *after* his application has
touched the memory within the limit, which may take quite a long time.
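
For reference, this is roughly the userspace side of the vmpressure
notifications being discussed - a minimal sketch against the memcg v1
pressure_level/eventfd interface. The cgroup path is made up and error
handling is omitted:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
	const char *cg = "/sys/fs/cgroup/memory/CT1";	/* made-up path */
	char path[256], ctl[64];
	uint64_t cnt;
	int efd, lfd, cfd;

	efd = eventfd(0, 0);			/* fd the kernel will signal */

	snprintf(path, sizeof(path), "%s/memory.pressure_level", cg);
	lfd = open(path, O_RDONLY);

	snprintf(path, sizeof(path), "%s/cgroup.event_control", cg);
	cfd = open(path, O_WRONLY);

	/* "<event_fd> <pressure_level_fd> <level>" arms the notification */
	snprintf(ctl, sizeof(ctl), "%d %d low", efd, lfd);
	write(cfd, ctl, strlen(ctl));

	read(efd, &cnt, sizeof(cnt));		/* blocks until reclaim pressure */
	fprintf(stderr, "vmpressure: dropping optimistically allocated buffers\n");
	return 0;
}

But my point stands: by the time this fires, the application has already
touched, and the container already holds, most of that memory.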

> > Also, I guess it'd be beneficial to have
> > 
> >  - mlocked pages accounting per cgroup, because they affect memory
> >    reclaim, and how low/high limits work, so it'd be nice to have them
> >    limited to a sane value;
> > 
> >  - shmem areas accounting per cgroup, because the total amount of shmem
> >    on the system is limited, and it'll be no good if malicious
> >    containers eat it all.
> > 
> > IMO it wouldn't be a good idea to overwhelm memcg with those limits; the
> > VM controller suits them much better.
> 
> Yeah, I do not think adding more to memcg is a good idea. I am still not
> sure whether working around bad design decisions in applications is a
> good rationale for a new controller.

Where do you see "bad design decision" in the example I've given above?
To recap, the user doesn't know how much memory his application is going
to consume and he wants to be notified about a potential failure as soon
as possible instead of waiting until it touches all the memory within
the container limit.
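
The per-process analogue that exists today is RLIMIT_AS: an overly
optimistic reservation fails right at mmap() time with ENOMEM, instead of
the task being OOM-killed hours later when it finally touches the memory.
A rough sketch of that failure mode (the sizes are arbitrary):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

int main(void)
{
	/* Cap the address space at 1G, then try to reserve 2G. */
	struct rlimit rl = { .rlim_cur = 1UL << 30, .rlim_max = 1UL << 30 };
	setrlimit(RLIMIT_AS, &rl);

	void *p = mmap(NULL, 2UL << 30, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		perror("mmap");			/* fails up front with ENOMEM */
	return 0;
}

What the controller aims at is essentially the same early failure, but
enforced per container rather than per process.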

Also, what's wrong if an application wants to eat a lot of shared
memory, which is a limited resource? Suppose the user sets memsw.limit
for his container to half of RAM, hoping the container is isolated and
won't cause any trouble, but eventually he finds other workloads failing
on the host because the processes inside the container have eaten all
the available shmem.
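
To make that concrete, a trivial sketch of how a task inside the container
can pin host-wide shmem (POSIX shm backed by tmpfs here; the segment name
is made up):

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t sz = 2UL << 30;			/* 2G of tmpfs-backed shm */
	int fd = shm_open("/big_segment", O_CREAT | O_RDWR, 0600);

	ftruncate(fd, sz);
	char *p = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	memset(p, 0, sz);			/* instantiate the pages */

	/* The segment persists until shm_unlink("/big_segment"), so the
	 * space stays consumed on the host even after this task exits. */
	return 0;
}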

Thanks.
