[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AANLkTi=ghZn9o_7Dab2GQebZUZBZk+um0vYbZZLC3dMf@mail.gmail.com>
Date: Tue, 14 Dec 2010 08:59:35 -0800
From: Josh Hunt <joshhunt00@...il.com>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc: Rusty Russell <rusty@...tcorp.com.au>,
Milton Miller <miltonm@....com>, Arnd Bergmann <arnd@...db.de>,
Greg KH <greg@...ah.com>, linux-kernel@...r.kernel.org,
anton@...ba.org
Subject: Re: cpumask: fix compat getaffinity
I realize this thread is fairly old, but I have a question about this
statement in the changelog:
> On NUMA systems with less than 32 possibly CPUs, the
> current compat_sys_sched_getaffinity now returns '4'
> instead of the actual NR_CPUS/8, which makes libnuma
> bail out when parsing the cpumap.
This does not seem accurate, at least not on my system with NR_CPUS=32
and nr_cpu_ids=8. The smallest value retlen will ever be set to is
sizeof(long) which is 8 on most modern systems. This breaks the
statement that this function will now return NR_CPUS/8.
Thanks
Josh
On Sun, May 16, 2010 at 11:04 PM, KOSAKI Motohiro
<kosaki.motohiro@...fujitsu.com> wrote:
>> On Wed, 12 May 2010 06:00:45 pm Milton Miller wrote:
>> >
>> > At least for parsing, we need to allocate and parse NR_CPUS until
>> > all places like arch/powerpc/platforms/pseries/xics.c that compare a
>> > user-supplied mask to CPUMASK_ALL are eliminated.
>>
>> Good point. Anton will want to fix those anyway for CONFIG_CPUMASK_OFFSTACK,
>> too, but that's the reason the parsing uses nr_cpumask_bits.
>>
>> > > Would it make sense to use my initial patch for -stable, which reverts
>> > > the ABI back to before the change that caused the problem, but apply
>> > > the correct fix (changing the ABI throughout) for future releases?
>> >
>> > This would definitly be the conservative fix.
>>
>> Instead of changing back to NR_CPUS which will break libnuma for
>> CPUMASK_OFFSTACK, how about changing it to nr_cpumask_bits and having an
>> explicit comment above it:
>
> Yes and No.
>
> 1) sched_getaffinity syscall is used from glibc and libnuma.
> 2) glibc doesn't use the return value almostly. glibc emulate it as NR_CPUS=1024.
> 3) Now, both sched_getaffinity() and compat_sys_sched_getaffinity() have nr_cpu_ids thing.
> 4) But only compat_sys_sched_getaffinity() hit libnuma problem.
>
> I think It mean compat_sys_sched_getaffinity() should behave as sched_getaffinity().
> IOW, libnuma assume compat_sys_sched_getaffinity() return len args or NR_CPUS.
> then, following patch do it. I confirmed the patch works with or without CPUMASK_OFFSTACK.
>
> So, My proposal is
> 1) merge both mine and yours to linus tree
> 2) but backport only mine
>
>
> How do you think?
>
>
> ==========================================================
> From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> Subject: [PATCH] cpumask: fix compat getaffinity
>
> Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> libnuma, which abuses sched_getaffinity to find out NR_CPUS
> in order to parse /sys/devices/system/node/node*/cpumap.
>
> On NUMA systems with less than 32 possibly CPUs, the
> current compat_sys_sched_getaffinity now returns '4'
> instead of the actual NR_CPUS/8, which makes libnuma
> bail out when parsing the cpumap.
>
> The libnuma call sched_getaffinity(0, bitmap, 4096)
> at first. It mean the libnuma expect the return value of
> sched_getaffinity() is either len argument or NR_CPUS.
> But it doesn't expect to return nr_cpu_ids.
>
> Strictly speaking, userland requirement are
>
> 1) Glibc assume the return value mean the lengh of initialized
> of mask argument. E.g. if sched_getaffinity(1024) return 128,
> glibc make zero fill rest 896 byte.
> 2) Libnuma assume the return value can be used to guess NR_CPUS
> in kernel. It assume len-arg<NR_CPUS makes -EINVAL. But
> it try len=4096 at first and 4096 is always bigger than
> NR_CPUS. Then, if we remove strange min_length normalization,
> we never hit -EINVAL case.
>
> sched_getaffinity() already solved this issue. This patch
> adapt compat_sys_sched_getaffinity() as it.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> ---
> kernel/compat.c | 25 +++++++++++--------------
> 1 files changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/compat.c b/kernel/compat.c
> index 7f40e92..5adab05 100644
> --- a/kernel/compat.c
> +++ b/kernel/compat.c
> @@ -495,29 +495,26 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
> {
> int ret;
> cpumask_var_t mask;
> - unsigned long *k;
> - unsigned int min_length = cpumask_size();
> -
> - if (nr_cpu_ids <= BITS_PER_COMPAT_LONG)
> - min_length = sizeof(compat_ulong_t);
>
> - if (len < min_length)
> + if ((len * BITS_PER_BYTE) < nr_cpu_ids)
> + return -EINVAL;
> + if (len & (sizeof(compat_ulong_t)-1))
> return -EINVAL;
>
> if (!alloc_cpumask_var(&mask, GFP_KERNEL))
> return -ENOMEM;
>
> ret = sched_getaffinity(pid, mask);
> - if (ret < 0)
> - goto out;
> + if (ret == 0) {
> + size_t retlen = min_t(size_t, len, cpumask_size());
>
> - k = cpumask_bits(mask);
> - ret = compat_put_bitmap(user_mask_ptr, k, min_length * 8);
> - if (ret == 0)
> - ret = min_length;
> -
> -out:
> + if (compat_put_bitmap(user_mask_ptr, cpumask_bits(mask), retlen * 8))
> + ret = -EFAULT;
> + else
> + ret = retlen;
> + }
> free_cpumask_var(mask);
> +
> return ret;
> }
>
> --
> 1.6.5.2
>
>
>
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists