linux-kernel - Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

Message-ID: <76a4d544-833a-5f42-a898-115640b6783b@alibaba-inc.com>
Date:   Tue, 31 Oct 2017 00:39:58 +0800
From:   "Yang Shi" <yang.s@...baba-inc.com>
To:     Jan Kara <jack@...e.cz>
Cc:     amir73il@...il.com, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

On 10/30/17 5:43 AM, Jan Kara wrote:
> On Sat 28-10-17 02:22:18, Yang Shi wrote:
>> If some process generates events into a huge or unlimited event queue, but
>> no listener reads them, they may silently consume a significant amount of
>> memory until OOM happens or some memory pressure issue is raised.
>> It would be better to account those slab caches in memcg so that we get a
>> heads-up before the problematic process consumes too much memory.
>>
>> But the accounting might be heuristic if the producer is in a different
>> memcg from the listener and the listener doesn't read the events. Under the
>> current design of kmemcg, whoever does the allocation gets the accounting.
>>
>> Signed-off-by: Yang Shi <yang.s@...baba-inc.com>
>> ---
>> v1 --> v2:
>> * Updated commit log per Amir's suggestion
> 
> I'm sorry but I don't think this solution is acceptable. I understand that
> in some cases (and you likely run one of these) the result may *happen* to
> be the desired one, but in other cases you might be charging the wrong memcg,
> so a misbehaving process in memcg A can effectively cause a DoS attack on
> a process in memcg B.

Yes, as I discussed with Amir in the earlier review, the current memcg
design just accounts memory to the allocating process and has no idea
which process is the consumer.

Although DoSing a memcg is not desirable, it still sounds better than
DoSing the whole machine due to a potential OOM. This patch aims to
avoid that case.
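
For illustration only, the charging behaviour under discussion can be
modelled in a few lines of Python. This is a toy sketch, not kernel code:
the Memcg class, queue_events() and the byte sizes are all hypothetical
names and numbers chosen for the example. It only shows that when the
allocator is charged, a listener in memcg A that never reads can exhaust
the limit of a producer in memcg B while A itself is charged nothing.

```python
from dataclasses import dataclass


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup (hypothetical, not a kernel API)."""
    name: str
    limit_bytes: int
    charged_bytes: int = 0

    def charge(self, nbytes: int) -> bool:
        """Charge nbytes to this group; refuse if the limit would be exceeded."""
        if self.charged_bytes + nbytes > self.limit_bytes:
            return False
        self.charged_bytes += nbytes
        return True


def queue_events(allocator: Memcg, n_events: int, event_bytes: int) -> int:
    """Queue events, charging each one to the allocating (producer) group.

    Returns the number of events queued before the charge was refused.
    """
    queued = 0
    for _ in range(n_events):
        if not allocator.charge(event_bytes):
            break
        queued += 1
    return queued


# Listener lives in memcg A and never reads; producer lives in memcg B.
listener_cg = Memcg("A", limit_bytes=1000)
producer_cg = Memcg("B", limit_bytes=1000)

# Assumed 48-byte events: every allocation is charged to the producer.
queued = queue_events(producer_cg, n_events=100, event_bytes=48)
print(queued, producer_cg.charged_bytes, listener_cg.charged_bytes)
```

In this model the producer's group hits its limit while the group of the
non-reading listener stays at zero, which is the mismatch Jan objects to.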

> 
> If you have a setup in which notification events can consume a considerable
> amount of resources, you are doing something wrong, I think. The standard
> event queue length is limited, so overall events are bounded to consume less
> than 1 MB. If you have an unbounded queue, the process has to be
> CAP_SYS_ADMIN, and presumably it has good reasons for requesting an
> unbounded queue and should know what it is doing.

Yes, I agree it does mean something is going wrong. So it would be better
to account the memory in order to get a heads-up early, before something
goes really bad. The limit will not be set too high, since fsnotify
metadata will not consume much memory in the *normal* case.
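
As a rough back-of-the-envelope check of the "normal case" footprint, the
worst-case size of a bounded queue is just the queue limit times the
per-event size. The sketch below assumes a queue limit of 16384 events (a
commonly seen default, treated here as an assumption) and a purely
hypothetical 48 bytes per queued event; neither number comes from this
thread.

```python
def worst_case_queue_bytes(max_queued_events: int, bytes_per_event: int) -> int:
    """Upper bound on memory held by one full, bounded event queue."""
    return max_queued_events * bytes_per_event


# Assumed values, for illustration only.
ASSUMED_MAX_QUEUED_EVENTS = 16384   # assumed default queue limit
ASSUMED_BYTES_PER_EVENT = 48        # hypothetical per-event allocation size

total = worst_case_queue_bytes(ASSUMED_MAX_QUEUED_EVENTS, ASSUMED_BYTES_PER_EVENT)
print(f"{total} bytes ({total / 1024:.0f} KiB)")
```

Under these assumptions a full bounded queue stays below 1 MiB, so a memcg
limit sized for normal operation would only be hit by an unbounded queue.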

I agree we should trust the admin user, but the kernel should be the last
line of defense when something is really going wrong. And we can't
guarantee that an admin process will not do something wrong: the code
might not be reviewed thoroughly, and the tests might not cover some
extreme cases.

> 
> So maybe we could come up with some better way to control amount of
> resources consumed by notification events but for that we lack more
> information about your use case. And I maintain that the solution should
> account events to the consumer, not the producer...

I do agree it is neither fair nor neat to account to the producer rather
than the misbehaving consumer, but the current memcg design does not
appear to support such a use case. The other question is: do we know who
the listener is if it doesn't read the events?

Thanks,
Yang

> 
> 								Honza
>