Date:   Sat, 21 Oct 2017 05:07:17 +0800
From:   "Yang Shi" <yang.s@...baba-inc.com>
To:     Amir Goldstein <amir73il@...il.com>
Cc:     Jan Kara <jack@...e.cz>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-mm@...ck.org, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] fs: fsnotify: account fsnotify metadata to kmemcg



On 10/19/17 8:14 PM, Amir Goldstein wrote:
> On Fri, Oct 20, 2017 at 12:20 AM, Yang Shi <yang.s@...baba-inc.com> wrote:
>> We observed that some misbehaving user applications can silently consume a
>> significant amount of fsnotify slab memory. It would be better to account
>> those slabs to kmemcg so that we get a heads-up before such applications
>> use too much memory.
> 
> In what way do they misbehave? Create a lot of marks? Create a lot of events?
> Not reading events in their queue?

It looks like both: a lot of marks and a lot of events. I'm not sure
about the latter case. If I knew more details about the behavior, I
would have elaborated more in the commit log.

> The latter case is more interesting:
> 
> Process A is the one that asked to get the events.
> Process B is the one that is generating the events and queuing them on
> the queue that is owned by process A, who is also to blame if the queue
> is not being read.

I agree it is not fair to account the memory to the generator. But,
afaik, charging a memcg other than the current one is not how memcg is
designed to work. Please see below for some details.

> 
> So why should process B be held accountable for memory pressure
> caused by, say, an FAN_UNLIMITED_QUEUE that process A created and
> doesn't read from?
> 
> Is it possible to get an explicit reference to the memcg's event cache
> at fsnotify_group creation time, store it in the group struct, and then
> allocate events from the event cache associated with the group (the
> listener) rather than the cache associated with the task generating the
> event?

I don't think the current memcg design can do this. Kmem accounting
happens at allocation time (when kmem_cache_alloc() is called) and takes
the associated memcg from the current task, so whoever does the
allocation gets it accounted. If the producer is in a different memcg
from the consumer, the memory is simply accounted to the producer's
memcg, even though the problem might have been caused by the consumer.
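
To make that concrete, below is a rough, illustration-only sketch of the
charge-target decision. The helper name is made up; the real logic lives
in memcg_kmem_get_cache() in mm/memcontrol.c:

/*
 * Made-up helper, sketching how the slab allocator picks the memcg to
 * charge for a SLAB_ACCOUNT allocation. Illustration only.
 */
static struct mem_cgroup *sketch_charge_target(void)
{
	/*
	 * The charge target is always derived from the task doing the
	 * allocation. Nothing in the kmem_cache_alloc() path lets the
	 * caller substitute a different (e.g. the listener's) memcg.
	 */
	return get_mem_cgroup_from_mm(current->mm);
}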

However, afaik, the producer and the consumer are typically in the same
memcg, so this might not be a big issue. But I do admit such unfair
accounting may happen.
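
For reference, your suggestion above might look roughly like the sketch
below. It is only a sketch: the struct, sketch_group_created() and
kmem_cache_alloc_memcg() are all made up, and as noted, the memcg code
offers no such interface today:

/* Hypothetical sketch of the suggestion above, not working code. */
struct fsnotify_group_sketch {
	struct mem_cgroup *memcg;	/* listener's memcg, pinned at creation */
	/* ... the existing fsnotify_group fields ... */
};

static void sketch_group_created(struct fsnotify_group_sketch *group)
{
	/*
	 * Resolve and pin the memcg of the task creating the group
	 * (the listener), so later event allocations could be charged
	 * to it. Holding a long-lived reference from here is part of
	 * the hypothetical design.
	 */
	group->memcg = get_mem_cgroup_from_mm(current->mm);
}

static void *sketch_alloc_event(struct fsnotify_group_sketch *group,
				struct kmem_cache *cache, gfp_t flags)
{
	/*
	 * Hypothetical: charge the object to group->memcg instead of
	 * the current task's memcg. kmem_cache_alloc() takes no such
	 * argument today, which is exactly the limitation described
	 * above.
	 */
	return kmem_cache_alloc_memcg(cache, flags, group->memcg);
}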

Thanks,
Yang

> 
> Amir.
> 
>>
>> Signed-off-by: Yang Shi <yang.s@...baba-inc.com>
>> ---
>>   fs/notify/dnotify/dnotify.c        | 4 ++--
>>   fs/notify/fanotify/fanotify_user.c | 6 +++---
>>   fs/notify/fsnotify.c               | 2 +-
>>   fs/notify/inotify/inotify_user.c   | 2 +-
>>   4 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
>> index cba3283..3ec6233 100644
>> --- a/fs/notify/dnotify/dnotify.c
>> +++ b/fs/notify/dnotify/dnotify.c
>> @@ -379,8 +379,8 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
>>
>>   static int __init dnotify_init(void)
>>   {
>> -       dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC);
>> -       dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC);
>> +       dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC|SLAB_ACCOUNT);
>> +       dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC|SLAB_ACCOUNT);
>>
>>          dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops);
>>          if (IS_ERR(dnotify_group))
>> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
>> index 907a481..7d62dee 100644
>> --- a/fs/notify/fanotify/fanotify_user.c
>> +++ b/fs/notify/fanotify/fanotify_user.c
>> @@ -947,11 +947,11 @@ static int fanotify_add_inode_mark(struct fsnotify_group *group,
>>    */
>>   static int __init fanotify_user_setup(void)
>>   {
>> -       fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC);
>> -       fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC);
>> +       fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC|SLAB_ACCOUNT);
>> +       fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC|SLAB_ACCOUNT);
>>   #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>>          fanotify_perm_event_cachep = KMEM_CACHE(fanotify_perm_event_info,
>> -                                               SLAB_PANIC);
>> +                                               SLAB_PANIC|SLAB_ACCOUNT);
>>   #endif
>>
>>          return 0;
>> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
>> index 0c4583b..82620ac 100644
>> --- a/fs/notify/fsnotify.c
>> +++ b/fs/notify/fsnotify.c
>> @@ -386,7 +386,7 @@ static __init int fsnotify_init(void)
>>                  panic("initializing fsnotify_mark_srcu");
>>
>>          fsnotify_mark_connector_cachep = KMEM_CACHE(fsnotify_mark_connector,
>> -                                                   SLAB_PANIC);
>> +                                                   SLAB_PANIC|SLAB_ACCOUNT);
>>
>>          return 0;
>>   }
>> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
>> index 7cc7d3f..57b32ff 100644
>> --- a/fs/notify/inotify/inotify_user.c
>> +++ b/fs/notify/inotify/inotify_user.c
>> @@ -785,7 +785,7 @@ static int __init inotify_user_setup(void)
>>
>>          BUG_ON(hweight32(ALL_INOTIFY_BITS) != 21);
>>
>> -       inotify_inode_mark_cachep = KMEM_CACHE(inotify_inode_mark, SLAB_PANIC);
>> +       inotify_inode_mark_cachep = KMEM_CACHE(inotify_inode_mark, SLAB_PANIC|SLAB_ACCOUNT);
>>
>>          inotify_max_queued_events = 16384;
>>          init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES] = 128;
>> --
>> 1.8.3.1
>>
