Date:	Thu,  5 Feb 2009 11:03:09 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	Ravikiran G Thirumalai <kiran@...lex86.org>
Cc:	kosaki.motohiro@...fujitsu.com, wli@...ementarian.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	shai@...lex86.org, Mel Gorman <mel@....ul.ie>,
	Nishanth Aravamudan <nacc@...ibm.com>
Subject: Re: [patch] mm: Fix SHM_HUGETLB to work with users in hugetlb_shm_group

(cc to Mel and Nishanth)

I think this requirement is reasonable, but I would also like Mel or
Nishanth to review it.


<<intentionally full quote>>

> On Wed, Feb 04, 2009 at 05:11:21PM -0500, wli@...ementarian.org wrote:
> >On Wed, Feb 04, 2009 at 02:04:28PM -0800, Ravikiran G Thirumalai wrote:
> >> ...
> >> As I see it we have the following options to fix this inconsistency:
> >> 1. Do not depend on RLIMIT_MEMLOCK for hugetlb shm mappings.  If a user
> >>    has CAP_IPC_LOCK or if user belongs to /proc/sys/vm/hugetlb_shm_group,
> >>    he should be able to use shm memory according to shmmax and shmall OR
> >> 2. Update the hugetlbpage documentation to mention the resource limit based
> >>    limitation, and remove the useless /proc/sys/vm/hugetlb_shm_group sysctl
> >> Which one is better?  I am leaning towards 1. and have a patch ready for 1.
> >> but I might be missing some historical reason for using RLIMIT_MEMLOCK with
> >> SHM_HUGETLB.
> >
> >We should do (1) because the hugetlb_shm_group and CAP_IPC_LOCK bits
> >should both continue to work as they did prior to RLIMIT_MEMLOCK -based
> >management of hugetlb. Please make sure the new RLIMIT_MEMLOCK -based
> >management still enables hugetlb shm when hugetlb_shm_group and
> >CAP_IPC_LOCK don't apply.
> >
> 
> OK, here's the patch.
> 
> Thanks,
> Kiran
> 
> 
> Fix hugetlb subsystem so that non root users belonging to hugetlb_shm_group
> can actually allocate hugetlb backed shm.
> 
> Currently non root users cannot even map one large page using SHM_HUGETLB
> when they belong to the gid in /proc/sys/vm/hugetlb_shm_group.
> This is because allocation size is verified against RLIMIT_MEMLOCK resource
> limit even if the user belongs to hugetlb_shm_group.
> 
> This patch
> 1. Fixes hugetlb subsystem so that users with CAP_IPC_LOCK and users
>    belonging to hugetlb_shm_group don't need to be restricted with
>    RLIMIT_MEMLOCK resource limits
> 2. If a user has sufficient memlock limit he can still allocate the hugetlb
>    shm segment.
> 
> Signed-off-by: Ravikiran Thirumalai <kiran@...lex86.org>
> 
> ---
> 
>  Documentation/vm/hugetlbpage.txt |   11 ++++++-----
>  fs/hugetlbfs/inode.c             |   18 ++++++++++++------
>  include/linux/mm.h               |    2 ++
>  mm/mlock.c                       |   11 ++++++++---
>  4 files changed, 28 insertions(+), 14 deletions(-)
> 
> Index: linux-2.6-tip/fs/hugetlbfs/inode.c
> ===================================================================
> --- linux-2.6-tip.orig/fs/hugetlbfs/inode.c	2009-02-04 15:21:45.000000000 -0800
> +++ linux-2.6-tip/fs/hugetlbfs/inode.c	2009-02-04 15:23:19.000000000 -0800
> @@ -943,8 +943,15 @@ static struct vfsmount *hugetlbfs_vfsmou
>  static int can_do_hugetlb_shm(void)
>  {
>  	return likely(capable(CAP_IPC_LOCK) ||
> -			in_group_p(sysctl_hugetlb_shm_group) ||
> -			can_do_mlock());
> +			in_group_p(sysctl_hugetlb_shm_group));
> +}
> +
> +static void acct_huge_shm_lock(size_t size, struct user_struct *user)
> +{
> +	unsigned long pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +	spin_lock(&shmlock_user_lock);
> +	acct_shm_lock(pages, user);
> +	spin_unlock(&shmlock_user_lock);
>  }
>  
>  struct file *hugetlb_file_setup(const char *name, size_t size)
> @@ -959,12 +966,11 @@ struct file *hugetlb_file_setup(const ch
>  	if (!hugetlbfs_vfsmount)
>  		return ERR_PTR(-ENOENT);
>  
> -	if (!can_do_hugetlb_shm())
> +	if (can_do_hugetlb_shm())
> +		acct_huge_shm_lock(size, user);
> +	else if (!user_shm_lock(size, user))
>  		return ERR_PTR(-EPERM);
>  
> -	if (!user_shm_lock(size, user))
> -		return ERR_PTR(-ENOMEM);
> -
>  	root = hugetlbfs_vfsmount->mnt_root;
>  	quick_string.name = name;
>  	quick_string.len = strlen(quick_string.name);
> Index: linux-2.6-tip/include/linux/mm.h
> ===================================================================
> --- linux-2.6-tip.orig/include/linux/mm.h	2009-02-04 15:21:45.000000000 -0800
> +++ linux-2.6-tip/include/linux/mm.h	2009-02-04 15:23:19.000000000 -0800
> @@ -737,8 +737,10 @@ extern unsigned long shmem_get_unmapped_
>  #endif
>  
>  extern int can_do_mlock(void);
> +extern void acct_shm_lock(unsigned long, struct user_struct *);
>  extern int user_shm_lock(size_t, struct user_struct *);
>  extern void user_shm_unlock(size_t, struct user_struct *);
> +extern spinlock_t shmlock_user_lock;
>  
>  /*
>   * Parameter block passed down to zap_pte_range in exceptional cases.
> Index: linux-2.6-tip/mm/mlock.c
> ===================================================================
> --- linux-2.6-tip.orig/mm/mlock.c	2009-02-04 15:21:45.000000000 -0800
> +++ linux-2.6-tip/mm/mlock.c	2009-02-04 15:23:19.000000000 -0800
> @@ -637,7 +637,13 @@ SYSCALL_DEFINE0(munlockall)
>   * Objects with different lifetime than processes (SHM_LOCK and SHM_HUGETLB
>   * shm segments) get accounted against the user_struct instead.
>   */
> -static DEFINE_SPINLOCK(shmlock_user_lock);
> +DEFINE_SPINLOCK(shmlock_user_lock);
> +
> +void acct_shm_lock(unsigned long pages, struct user_struct *user)
> +{
> +	get_uid(user);
> +	user->locked_shm += pages;
> +}
>  
>  int user_shm_lock(size_t size, struct user_struct *user)
>  {
> @@ -653,8 +659,7 @@ int user_shm_lock(size_t size, struct us
>  	if (!allowed &&
>  	    locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
>  		goto out;
> -	get_uid(user);
> -	user->locked_shm += locked;
> +	acct_shm_lock(locked, user);
>  	allowed = 1;
>  out:
>  	spin_unlock(&shmlock_user_lock);
> Index: linux-2.6-tip/Documentation/vm/hugetlbpage.txt
> ===================================================================
> --- linux-2.6-tip.orig/Documentation/vm/hugetlbpage.txt	2009-02-04 15:21:45.000000000 -0800
> +++ linux-2.6-tip/Documentation/vm/hugetlbpage.txt	2009-02-04 15:23:19.000000000 -0800
> @@ -147,11 +147,12 @@ used to change the file attributes on hu
>  
>  Also, it is important to note that no such mount command is required if the
>  applications are going to use only shmat/shmget system calls.  Users who
> -wish to use hugetlb page via shared memory segment should be a member of
> -a supplementary group and system admin needs to configure that gid into
> -/proc/sys/vm/hugetlb_shm_group.  It is possible for same or different
> -applications to use any combination of mmaps and shm* calls, though the
> -mount of filesystem will be required for using mmap calls.
> +wish to use hugetlb page via shared memory segment should either have
> +sufficient memlock resource limits or, they need to be a member of
> +a supplementary group, and system admin needs to configure that gid into
> +/proc/sys/vm/hugetlb_shm_group. It is possible for same or different
> +applications to use any combination of mmaps and shm* calls, though
> +the mount of filesystem will be required for using mmap calls.
>  
>  *******************************************************************
>  
> 