linux-kernel - Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg


Message-ID: <20171101151534.GC28572@quack2.suse.cz>
Date:   Wed, 1 Nov 2017 16:15:34 +0100
From:   Jan Kara <jack@...e.cz>
To:     Yang Shi <yang.s@...baba-inc.com>
Cc:     Jan Kara <jack@...e.cz>, amir73il@...il.com,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, mhocko@...e.cz
Subject: Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

On Wed 01-11-17 00:44:18, Yang Shi wrote:
> On 10/31/17 3:12 AM, Jan Kara wrote:
> >On Tue 31-10-17 00:39:58, Yang Shi wrote:
> >>On 10/30/17 5:43 AM, Jan Kara wrote:
> >>>On Sat 28-10-17 02:22:18, Yang Shi wrote:
> >>>>If some process generates events into a huge or unlimited event queue, but
> >>>>no listener reads them, they may consume a significant amount of memory
> >>>>silently until OOM happens or some memory pressure issue is raised.
> >>>>It'd be better to account those slab caches in memcg so that we can get a
> >>>>heads-up before the problematic process consumes too much memory silently.
> >>>>
> >>>>But the accounting might be heuristic if the producer is in a different
> >>>>memcg from the listener and the listener doesn't read the events. Due to
> >>>>the current design of kmemcg, whoever does the allocation gets the
> >>>>accounting.
> >>>>
> >>>>Signed-off-by: Yang Shi <yang.s@...baba-inc.com>
> >>>>---
> >>>>v1 --> v2:
> >>>>* Updated commit log per Amir's suggestion
> >>>
> >>>I'm sorry but I don't think this solution is acceptable. I understand that
> >>>in some cases (and you likely run one of these) the result may *happen* to
> >>>be the desired one but in other cases, you might be charging the wrong
> >>>memcg and so a misbehaving process in memcg A can effectively cause a DoS
> >>>attack on a process in memcg B.
> >>
> >>Yes, as I discussed with Amir in the earlier review, the current memcg
> >>design just accounts memory to the allocating process, but has no idea who
> >>the consumer process is.
> >>
> >>Although it is not desirable to DoS a memcg, it still sounds better than
> >>DoSing the whole machine due to a potential OOM. This patch aims to avoid
> >>such a case.
> >
> >Thinking about this even more, your solution may have an even worse impact:
> >due to allocations failing, some applications may avoid generating fs
> >notification events for the actions they do. And that may be a security
> >issue in case there are other applications using fanotify for security
> >enforcement, virus scanning, or whatever... In such cases it is better to
> >take the whole machine down than to let it run.
> 
> I guess (just guess) this might be able to be solved by Amir's patch, right?
> An overflow or error event will be queued, then the consumer applications
> could do nicer error handling/softer exit.

Well, Amir's patch solves the problem of visibility that something bad
(a lost event) happened. But it does not address the fundamental issue that
you account the event to the wrong memcg and thus fail the allocation at
the wrong times.

> Actually, the event is dropped on -ENOMEM regardless of my patch. As Amir
> said, this patch may just amplify this problem if my understanding is right.

So currently, -ENOMEM cannot normally happen for such a small allocation. The
kernel will rather go OOM and kill some process to free memory. So putting
memcgs into the picture changes the behavior.
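
To make the concern concrete, here is a toy userspace model of "whoever
allocates gets charged". It is only a sketch and not kernel code: Memcg,
Queue, queue_event(), simulate(), the 64-byte event size and the 256-byte
limit are all made up for illustration. The one thing it tries to show is
that the charge lands on the producer's memcg, so the producer's limit
decides when events start being dropped, even though the queue belongs to a
listener in another memcg.

```python
from dataclasses import dataclass, field

ENOMEM = 12  # errno value for "out of memory" on Linux


@dataclass
class Memcg:
    """Toy memory cgroup: a name, a byte limit and a usage counter."""
    name: str
    limit: int
    usage: int = 0

    def try_charge(self, size: int) -> bool:
        # Refuse the charge if it would push usage over the limit.
        if self.usage + size > self.limit:
            return False
        self.usage += size
        return True


@dataclass
class Queue:
    """Toy notification queue owned by a listener that never reads it."""
    owner: Memcg
    events: list = field(default_factory=list)
    dropped: int = 0


def queue_event(queue: Queue, producer: Memcg, size: int = 64) -> int:
    # Assumption of this sketch: the event is charged to the memcg of the
    # process that triggers the allocation (the producer), not to the
    # memcg that owns the queue.
    if not producer.try_charge(size):
        queue.dropped += 1
        return -ENOMEM
    queue.events.append(size)
    return 0


def simulate(n_events: int = 10):
    listener_cg = Memcg("listener", limit=1 << 20)  # roomy limit, never charged
    producer_cg = Memcg("producer", limit=256)      # tight limit, pays for all
    queue = Queue(owner=listener_cg)
    results = [queue_event(queue, producer_cg) for _ in range(n_events)]
    return queue, producer_cg, listener_cg, results


if __name__ == "__main__":
    queue, producer_cg, listener_cg, results = simulate()
    print("producer usage:", producer_cg.usage)
    print("listener usage:", listener_cg.usage)
    print("dropped events:", queue.dropped)
```

In this model the listener's memcg is never charged at all, while the
producer's memcg fills up and every later event fails with -ENOMEM and is
dropped. Without a per-memcg limit in the picture, the same small allocation
would not normally fail that way.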

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR