lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20151110080709.GR31308@esperanza>
Date:	Tue, 10 Nov 2015 11:07:09 +0300
From:	Vladimir Davydov <vdavydov@...tuozzo.com>
To:	Michal Hocko <mhocko@...nel.org>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Tejun Heo <tj@...nel.org>, Greg Thelen <gthelen@...gle.com>,
	<linux-mm@...ck.org>, <cgroups@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 5/5] Account certain kmem allocations to memcg

On Mon, Nov 09, 2015 at 03:39:55PM +0100, Michal Hocko wrote:
> On Sat 07-11-15 23:07:09, Vladimir Davydov wrote:
> > This patch marks those kmem allocations that are known to be easily
> > triggered from userspace as __GFP_ACCOUNT, which makes them accounted to
> > memcg. For the list, see below:
> > 
> >  - threadinfo
> >  - task_struct
> >  - task_delay_info
> >  - pid
> >  - cred
> >  - mm_struct
> >  - vm_area_struct and vm_region (nommu)
> >  - anon_vma and anon_vma_chain
> >  - signal_struct
> >  - sighand_struct
> >  - fs_struct
> >  - files_struct
> >  - fdtable and fdtable->full_fds_bits
> >  - dentry and external_name
> >  - inode for all filesystems. This is the most tedious part, because
> >    most filesystems overwrite the alloc_inode method. Looks like using
> >    __GFP_ACCOUNT in alloc_inode is going to become a new rule, like
> >    passing SLAB_RECLAIM_ACCOUNT on inode cache creation.
> 
> I am wondering whether using a helper function to allocate an inode
> cache would help in that regards. It would limit __GFP_ACCOUNT
> penetration into fs code.

I'm afraid that wouldn't free fs code from the need to use
__GFP_ACCOUNT, because there are other things that we might want to
account AFAICS, e.g. ext4_crypt_info_cachep or ext4_es_cachep.

> 
> pipe buffers are trivial to abuse (e.g. via fd passing) so we want to

You might also mention allocations caused by select/poll, page tables,
radix_tree_node, etc. They all might be abused, but the primary purpose
of this patch set is not catching abusers, but providing reasonable
level of isolation for most normal workloads. Let's add everything above
that in separate patches.

> cap those as well. The following should do the trick AFAICS.

Actually, no - you only account pipe metadata while anon pipe buffer
pages, which usually constitute most of memory consumed by a pipe, still
go unaccounted. I'm planning to make pipe accountable later.

> ---
> diff --git a/fs/pipe.c b/fs/pipe.c
> index 8865f7963700..c4b7e8c08362 100644
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -590,7 +590,7 @@ struct pipe_inode_info *alloc_pipe_info(void)
>  
>  	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL);
>  	if (pipe) {
> -		pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * PIPE_DEF_BUFFERS, GFP_KERNEL);
> +		pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * PIPE_DEF_BUFFERS, GFP_KERNEL | __GFP_ACCOUNT);

GFP_KERNEL | __GFP_ACCOUNT are used really often, that's why I
introduced GFP_KERNEL_ACCOUNT.

Thanks,
Vladimir
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ