linux-kernel - Re: [RFC] Making memcg track ownership per address_space or anon

Message-ID: <54D25DBD.5080009@yandex-team.ru>
Date:	Wed, 04 Feb 2015 20:58:21 +0300
From:	Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:	Tejun Heo <tj@...nel.org>
CC:	Greg Thelen <gthelen@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Hocko <mhocko@...e.cz>,
	Cgroups <cgroups@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Jan Kara <jack@...e.cz>, Dave Chinner <david@...morbit.com>,
	Jens Axboe <axboe@...nel.dk>,
	Christoph Hellwig <hch@...radead.org>,
	Li Zefan <lizefan@...wei.com>, Hugh Dickins <hughd@...gle.com>,
	Roman Gushchin <klamm@...dex-team.ru>
Subject: Re: [RFC] Making memcg track ownership per address_space or anon_vma

On 04.02.2015 20:15, Tejun Heo wrote:
> Hello,
>
> On Wed, Feb 04, 2015 at 01:49:08PM +0300, Konstantin Khlebnikov wrote:
>> I think important shared data must be handled and protected explicitly.
>> That 'catch-all' shared container could be separated into several
>
> I kinda disagree.  That'd be a major pain in the ass to use and you
> wouldn't know when you got something wrong unless it actually goes
> wrong and you know enough about the innerworkings to look for that.
> Doesn't sound like a sound design to me.
>
>> memory cgroups depending on importance of files: glibc protected
>> with soft guarantee, less important stuff is placed into another
>> cgroup and cannot push top-priority libraries out of ram.
>
> That sounds extremely painful.

I mean this thing _could_ be controlled more precisely. Even if the
default policy works for 99% of users, a manual override is still
required for the remaining 1%, or for when something goes wrong.

>
>> If shared files are free for use then that 'shared' container must be
>> ready to keep them in memory. Otherwise this needs to be fixed on the
>> container side: we could ignore mlock for shared inodes, or the amount
>> of such vmas might be limited on a per-container basis.
>>
>> But sharing responsibility for a shared file is a vague concept: the
>> memory usage and limit of a container must depend only on its own
>> behavior, not on its neighbors on the same machine.
>>
>>
>> Generally, incidental sharing could be handled as temporary sharing:
>> the default policy (if the inode isn't pinned to a memory cgroup)
>> should, after some time, detect that the inode is no longer shared and
>> migrate it back into the original cgroup. Of course a task could
>> provide a hint: O_NO_MOVEMEM, or even the whole memory cgroup where it
>> runs could be marked as a "scanner" which shouldn't disturb memory
>> classification.
>
> Ditto for annotating each file individually.  Let's please try to stay
> away from things like that.  That's mostly a cop-out which is unlikely
> to actually benefit the majority of users.

A process that scans all files once isn't such a rare use case.
Linux still sometimes handles this pattern poorly.
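
To make the scan-once pattern concrete, here is a minimal userspace
sketch using the existing posix_fadvise() hint. This is not the
O_NO_MOVEMEM flag or the "scanner" cgroup mark discussed above (neither
exists); it only shows the nearest per-file hint available today. The
function name scan_once and the chunk size are assumptions made for
illustration, and the hint is advisory, so it does nothing about which
memory cgroup the pages are charged to while they stay cached.

```python
import os


def scan_once(path, chunk_size=1 << 20):
    """Read a file sequentially, then hint that its cached pages
    are no longer needed. Returns the number of bytes read.

    scan_once is a hypothetical helper name, not an existing API.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
        # offset=0 and len=0 cover the whole file. The call is only a
        # hint; the kernel is free to keep some or all of the pages.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

A backup or indexing tool would call scan_once() on each file it
visits. Every scanner has to opt in per file this way, which is the
kind of per-file annotation being argued about in this thread.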

>
>> I've missed an obvious solution for controlling the memory cgroup for
>> files: project id. This is a persistent integer id stored in the file
>> system. For now it's implemented only for xfs and used for quota,
>> which is orthogonal to user/group quotas. We could map some project
>> ids to memory cgroups. That is more flexible than a per-superblock
>> mark and has no conflicts like a mark on a bind-mount.
>
> Again, hell, no.
>
> Thanks.
>

-- 
Konstantin