Message-ID: <20100315160454.GA8211@sgi.com>
Date:	Mon, 15 Mar 2010 11:04:54 -0500
From:	Jack Steiner <steiner@....com>
To:	mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
	torvalds@...ux-foundation.org, travis@....com,
	peterz@...radead.org, drepper@...hat.com, rja@....com,
	sharyath@...ibm.com, akpm@...ux-foundation.org, tglx@...utronix.de,
	kosaki.motohiro@...fujitsu.com, mingo@...e.hu
Cc:	linux-tip-commits@...r.kernel.org
Subject: Re: [tip:sched/urgent] sched: sched_getaffinity(): Allow less than
	NR_CPUS length

On Mon, Mar 15, 2010 at 07:43:02AM +0000, tip-bot for KOSAKI Motohiro wrote:
> Commit-ID:  cd3d8031eb4311e516329aee03c79a08333141f1
> Gitweb:     http://git.kernel.org/tip/cd3d8031eb4311e516329aee03c79a08333141f1
> Author:     KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> AuthorDate: Fri, 12 Mar 2010 16:15:36 +0900
> Committer:  Ingo Molnar <mingo@...e.hu>
> CommitDate: Mon, 15 Mar 2010 08:28:44 +0100

...
> IOW we hope they are not annoyed by this issue ...

The change looks ok but I can't reproduce the problem.

I'm running on a distro kernel that has NR_CPUS=4096.  Glibc also has a
matching definition of __CPU_SETSIZE (I assume this change was made by the
distro but am not certain):

    sched.c
    	...
	#if defined _SCHED_H && !defined __cpu_set_t_defined
	# define __cpu_set_t_defined
	/* Size definition for CPU sets.  */
	# define __CPU_SETSIZE  4096
	# define __NCPUBITS     (8 * sizeof (__cpu_mask))

Your test program runs ok:

	% strace t
	...
	sched_getaffinity(0, 512,  { ffff, 0, 0, 0, 0, 0, 0, 0 }) = 64
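
For reading the strace line above: the second argument is the buffer length in
bytes (512 bytes = 4096 CPU bits, matching __CPU_SETSIZE 4096), and the value
after "=" is what the raw syscall returned, i.e. the number of bytes the
kernel copied (64 bytes = 512 CPU bits). The glibc wrapper hides that byte
count and returns 0 on success. A minimal sketch for observing both numbers
without strace, assuming Linux with glibc; the function names are
hypothetical and not part of the original test program:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Invoke the raw syscall, bypassing the glibc wrapper.
 * Returns the number of bytes the kernel copied, or -1 with errno set.
 */
long raw_getaffinity_bytes(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	return syscall(SYS_sched_getaffinity, 0, sizeof(set), &set);
}

/* Print the user-space buffer size next to the kernel's copy size. */
void report_mask_sizes(void)
{
	long n = raw_getaffinity_bytes();

	printf("sizeof(cpu_set_t) = %zu bytes (%zu CPU bits)\n",
	       sizeof(cpu_set_t), 8 * sizeof(cpu_set_t));
	if (n < 0)
		perror("sched_getaffinity");
	else
		printf("kernel copied %ld bytes (%ld CPU bits)\n", n, 8 * n);
}
```

Calling report_mask_sizes() from main() prints the two sizes; the values
depend on the glibc build and on the running kernel.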



Also note that we've run on IA64 systems with NR_CPUS=4096 for several years w/o
hitting any problems.

Bottom line: I don't think the change will affect us.
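
For programs that cannot rely on a rebuilt glibc, one way to stay independent
of __CPU_SETSIZE is the dynamically sized CPU set interface in glibc
(CPU_ALLOC and friends). The following is only a sketch under assumptions:
the helper name, the starting size of 1024 CPUs and the upper bound of
1 << 20 CPUs are all hypothetical choices, not something from this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the CPUs in the caller's affinity mask,
 * doubling the buffer until the kernel accepts its length.
 * Returns the CPU count, or -1 on failure.
 */
int count_affinity_cpus(void)
{
	int ncpus;

	/* 1024 and (1 << 20) are arbitrary illustrative bounds. */
	for (ncpus = 1024; ncpus <= (1 << 20); ncpus *= 2) {
		cpu_set_t *set = CPU_ALLOC(ncpus);
		size_t size = CPU_ALLOC_SIZE(ncpus);
		int saved_errno;

		if (set == NULL)
			return -1;
		CPU_ZERO_S(size, set);
		if (sched_getaffinity(0, size, set) == 0) {
			int count = CPU_COUNT_S(size, set);

			CPU_FREE(set);
			return count;
		}
		saved_errno = errno;	/* keep errno across the free */
		CPU_FREE(set);
		if (saved_errno != EINVAL)
			return -1;	/* not a length problem; do not retry */
	}
	return -1;
}
```

The retry only happens on EINVAL, which is the error the kernel returns when
the supplied length is too small for the machine.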


> 
> sched: sched_getaffinity(): Allow less than NR_CPUS length
> 
> [ Note, this commit changes the syscall ABI for > 1024 CPUs systems. ]
> 
> Recently, some distro decided to use NR_CPUS=4096 for mysterious reasons.
> Unfortunately, glibc sched interface has the following definition:
> 
> 	# define __CPU_SETSIZE  1024
> 	# define __NCPUBITS     (8 * sizeof (__cpu_mask))
> 	typedef unsigned long int __cpu_mask;
> 	typedef struct
> 	{
> 	  __cpu_mask __bits[__CPU_SETSIZE / __NCPUBITS];
> 	} cpu_set_t;
> 
> It means that if NR_CPUS is bigger than 1024, cpu_set_t creates an
> ABI issue ...
> 
> More recently, Sharyathi Nagesh reported that the following test program
> fails with a mysterious syscall error:
> 
>  -----------------------------------------------------------------------
>  #define _GNU_SOURCE
>  #include<stdio.h>
>  #include<errno.h>
>  #include<sched.h>
> 
>  int main()
>  {
>      cpu_set_t set;
>      if (sched_getaffinity(0, sizeof(cpu_set_t), &set) < 0)
>          printf("\n Call is failing with:%d", errno);
>  }
>  -----------------------------------------------------------------------
> 
> This is because the kernel assumes the len argument of sched_getaffinity()
> is bigger than NR_CPUS. But that assumption no longer holds.
> 
> Now we are faced with the following annoying dilemma, due to
> the limitations of the glibc interface built in years ago:
> 
>  (1) if we change glibc's __CPU_SETSIZE definition, we lose
>      binary compatibility for _all_ applications.
> 
>  (2) if we don't change it, we also lose binary compatibility for
>      Sharyathi's use case.
> 
> So I would propose changing the rule for the len argument of
> sched_getaffinity().
> 
> Old:
> 	len should be bigger than NR_CPUS
> New:
> 	len should be bigger than maximum possible cpu id
> 
> This creates the following behavior:
> 
>  (A) On a real 4096-cpu machine, the above test program still
>      returns -EINVAL.
> 
>  (B) With NR_CPUS=4096 but a machine that has fewer than 1024 cpus (almost
>      all machines in the world), the above runs successfully.
> 
> Fortunately, BIG SGI machines are mainly used for HPC use cases. That means
> they can rebuild their programs.
> 
> IOW we hope they are not annoyed by this issue ...
> 
> Reported-by: Sharyathi Nagesh <sharyath@...ibm.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> Acked-by: Ulrich Drepper <drepper@...hat.com>
> Acked-by: Peter Zijlstra <peterz@...radead.org>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Jack Steiner <steiner@....com>
> Cc: Russ Anderson <rja@....com>
> Cc: Mike Travis <travis@....com>
> LKML-Reference: <20100312161316.9520.A69D9226@...fujitsu.com>
> Signed-off-by: Ingo Molnar <mingo@...e.hu>
> ---
>  kernel/sched.c |   10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9ab3cd7..6eaef3d 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4902,7 +4902,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
>  	int ret;
>  	cpumask_var_t mask;
>  
> -	if (len < cpumask_size())
> +	if (len < nr_cpu_ids)
> +		return -EINVAL;
> +	if (len & (sizeof(unsigned long)-1))
>  		return -EINVAL;
>  
>  	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
> @@ -4910,10 +4912,12 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
>  
>  	ret = sched_getaffinity(pid, mask);
>  	if (ret == 0) {
> -		if (copy_to_user(user_mask_ptr, mask, cpumask_size()))
> +		int retlen = min(len, cpumask_size());
> +
> +		if (copy_to_user(user_mask_ptr, mask, retlen))
>  			ret = -EFAULT;
>  		else
> -			ret = cpumask_size();
> +			ret = retlen;
>  	}
>  	free_cpumask_var(mask);
>  
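
The length handling in the hunks above can be modelled in user space to check
cases (A) and (B) by hand. This is only a sketch, not kernel code. The
function name and parameters are hypothetical, and it ASSUMES that len is
converted from bytes to bits before being compared with nr_cpu_ids, whereas
the quoted hunk compares len with nr_cpu_ids directly.

```c
#include <errno.h>
#include <stddef.h>

/*
 * User-space model (not kernel code) of the patched length handling.
 *   len          buffer length in bytes passed by the caller
 *   nr_cpu_ids   highest possible CPU id + 1 on the running machine
 *   cpumask_size size in bytes of the kernel's own mask
 * Returns the number of bytes that would be copied, or -EINVAL.
 * ASSUMPTION: the comparison is done in bits (len * 8).
 */
long model_getaffinity_len(size_t len, unsigned int nr_cpu_ids,
			   size_t cpumask_size)
{
	/* Buffer must have room for every possible CPU id. */
	if (len * 8 < nr_cpu_ids)
		return -EINVAL;
	/* Buffer length must be a multiple of sizeof(unsigned long). */
	if (len & (sizeof(unsigned long) - 1))
		return -EINVAL;
	/* Copy no more than the caller asked for or the kernel has. */
	return (long)(len < cpumask_size ? len : cpumask_size);
}
```

Under this model a 128-byte (1024-bit) cpu_set_t is rejected when nr_cpu_ids
is 4096, as in case (A), and accepted when nr_cpu_ids is small, as in case
(B). A 512-byte buffer with a 64-byte kernel mask yields 64, the same return
value as in the strace output earlier in this message; the 64-byte mask size
is inferred from that output, not stated in the thread.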