Message-ID: <aa9e029703e184a56bcab9f0992cfff316136d16.camel@perches.com>
Date: Wed, 06 May 2020 12:06:10 -0700
From: Joe Perches <joe@...ches.com>
To: Christian Brauner <christian.brauner@...ntu.com>,
linux-kernel@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Eugene Syromiatnikov <esyr@...hat.com>,
Christian Kellner <christian@...lner.me>,
Aleksa Sarai <cyphar@...har.com>,
"Dmitry V. Levin" <ldv@...linux.org>,
Arnd Bergmann <arnd@...db.de>, Serge Hallyn <serge@...lyn.com>,
Tejun Heo <tj@...nel.org>, Oleg Nesterov <oleg@...hat.com>,
Jan Stancek <jstancek@...hat.com>,
Andreas Schwab <schwab@...ux-m68k.org>,
Florian Weimer <fw@...eb.enyo.de>, libc-alpha@...rceware.org
Subject: Re: [PATCH] clone: only use lower 32 flag bits
On Tue, 2020-05-05 at 19:44 +0200, Christian Brauner wrote:
> Jan reported an issue where an interaction between sign-extending clone's
> flag argument on ppc64le and the new CLONE_INTO_CGROUP feature causes
> clone() to consistently fail with EBADF.
[]
> Let's fix this by always capping the upper 32 bits for the legacy clone()
> syscall. This ensures that we can't reach clone3() only features by
> accident via legacy clone as with the sign extension case and also that
> legacy clone() works exactly like before, i.e. ignoring any unknown flags.
> This solution risks no regressions and is also pretty clean.
>
> I've chosen u32 and not unsigned int to visually indicate that we're
> capping this to 32 bits.
Perhaps use the lower_32_bits macro?
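
For illustration only, here is a minimal userspace sketch of the sign-extension problem described above and of what truncating to the lower 32 bits does about it. The lower_32_bits() definition is a stand-in written to behave like the kernel macro, the CLONE_INTO_CGROUP value is copied in so the snippet compiles on its own, and sign_extend_flags() and check_sign_extension() are hypothetical helpers, not kernel code. The input with only bit 31 set is just an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's lower_32_bits() macro. */
#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffff))

/* CLONE_INTO_CGROUP value, repeated here to keep the sketch self-contained. */
#define CLONE_INTO_CGROUP 0x200000000ULL

/*
 * Hypothetical helper: models a 32-bit signed flag word being
 * sign-extended into a 64-bit register before the syscall.
 */
static uint64_t sign_extend_flags(int32_t flags)
{
	return (uint64_t)(int64_t)flags;
}

/* Hypothetical self-check walking through the failure and the fix. */
static void check_sign_extension(void)
{
	/* Example input: only bit 31 is set in the 32-bit flag word. */
	uint64_t widened = sign_extend_flags(INT32_MIN);

	/* Sign extension fills the upper 32 bits with ones... */
	assert(widened == 0xffffffff80000000ULL);
	/* ...so a clone3()-only bit appears to be set by accident. */
	assert((widened & CLONE_INTO_CGROUP) != 0);
	/* Keeping only the lower 32 bits restores the intended flags. */
	assert(lower_32_bits(widened) == 0x80000000u);
}
```

Calling check_sign_extension() from a test program should pass silently if the assumptions above hold.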
> diff --git a/kernel/fork.c b/kernel/fork.c
[]
> @@ -2569,12 +2569,21 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
> unsigned long, tls)
> #endif
> {
> + /*
> + * On 64 bit unsigned long can be used by userspace to
> + * pass flag values only useable with clone3(). So cap
> + * the flag argument to the lower 32 bits. This is fine,
> + * since legacy clone() has traditionally ignored unknown
> + * flag values. So don't break userspace workloads that
> + * (on accident or on purpose) rely on this.
> + */
> + u32 flags = (u32)clone_flags;
> struct kernel_clone_args args = {
> - .flags = (clone_flags & ~CSIGNAL),
> + .flags = (flags & ~CSIGNAL),
so:
	.flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
> .pidfd = parent_tidptr,
> .child_tid = child_tidptr,
> .parent_tid = parent_tidptr,
> - .exit_signal = (clone_flags & CSIGNAL),
> + .exit_signal = (flags & CSIGNAL),
	.exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
> .stack = newsp,
> .tls = tls,
> };
>
> base-commit: 0e698dfa282211e414076f9dc7e83c1c288314fd
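
As a rough check that the two spellings are interchangeable, the sketch below splits a flag word into .flags and .exit_signal once with the (u32) cast from the patch and once with lower_32_bits(). It is a userspace approximation under stated assumptions: struct demo_clone_args is a hypothetical cut-down stand-in for struct kernel_clone_args, uint64_t stands in for unsigned long on a 64-bit kernel, the macro is a local stand-in for the kernel one, and both function names are made up for this example.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's lower_32_bits() macro. */
#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffff))

/* Exit-signal mask, repeated here to keep the sketch self-contained. */
#define CSIGNAL 0x000000ff

/* Hypothetical, cut-down stand-in for struct kernel_clone_args. */
struct demo_clone_args {
	uint64_t flags;
	int exit_signal;
};

/* Variant from the patch: truncate once with a cast, then mask. */
static struct demo_clone_args split_with_cast(uint64_t clone_flags)
{
	uint32_t flags = (uint32_t)clone_flags;
	struct demo_clone_args args = {
		.flags       = (flags & ~CSIGNAL),
		.exit_signal = (flags & CSIGNAL),
	};

	return args;
}

/* Suggested variant: truncate with lower_32_bits() at each use. */
static struct demo_clone_args split_with_macro(uint64_t clone_flags)
{
	struct demo_clone_args args = {
		.flags       = (lower_32_bits(clone_flags) & ~CSIGNAL),
		.exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
	};

	return args;
}
```

Both helpers should return the same pair for any input, including a sign-extended value such as 0xffffffff80000011, where the upper 32 bits are dropped and the low byte becomes the exit signal.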