[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1944484270.14303.1594825847550.JavaMail.zimbra@efficios.com>
Date: Wed, 15 Jul 2020 11:10:47 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Christian Brauner <christian.brauner@...ntu.com>
Cc: Florian Weimer <fw@...eb.enyo.de>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
paulmck <paulmck@...ux.ibm.com>,
Boqun Feng <boqun.feng@...il.com>,
"H. Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api <linux-api@...r.kernel.org>, carlos <carlos@...hat.com>
Subject: Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
----- On Jul 15, 2020, at 8:33 AM, Christian Brauner christian.brauner@...ntu.com wrote:
[...]
>
> So here's a very free-wheeling draft of roughly what I had in mind. Not
> even compile-tested just to illustrate what I'd change and sorry if that
> code will make you sob in your hands:
>
[...]
> + /*
> + * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> + * statically initialized to offsetof(struct rseq, end).
> + */
> + __u32 user_size;
> + /*
> + * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> + * extensible struct rseq ABI, the kernel_size field is populated by
> + * the kernel to the minimum between user_size and the offset of the
> + * "end" field within the struct rseq supported by the kernel on
> + * successful registration. Should be initialized to 0.
> + */
> + __u32 kernel_size;
Moving from __u16 to __u32 for both fields don't achieve much, and increase
the size of struct rseq (excluding padding) from 24 bytes to 28 bytes.
Note that the struct rseq alignment is 32 bytes. At 24 bytes, it leaves room
for exactly one 8 bytes pointer, which can be useful for future extensions.
If the size is increased to 28 bytes or more, then we're done and cannot
add a pointer.
> + __u32 active_size;
This additional field takes the very last bytes of padding we have in the
current layout.
> } __attribute__((aligned(4 * sizeof(__u64))));
>
> +#define RSEQ_SIZE_VER0 24 /* sizeof first published struct */
This is incorrect. The sizeof(struct_rseq) with its 32 bytes alignment is 32,
not 24. The padding at the end of the structure is considered as part of its
size, but we cannot rely on its content being zero-initialized based on the
C standard.
> +#define RSEQ_SIZE_VER1 32 /* sizeof second published struct */
> +#define RSEQ_SIZE_LATEST RSEQ_SIZE_VER1 /* sizeof last published struct */
> +
[...]
> @@ -349,10 +363,58 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32,
> rseq_len,
> * ensure the provided rseq is properly aligned and valid.
> */
> if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
> - rseq_len != sizeof(*rseq))
> + rseq_len < RSEQ_SIZE_VER0)
This could perhaps be changed for future kernels, but will break for existing
kernels as soon as rseq_len is increased. This is something we should have
planned for in the initial implementation of the system call, but here we are.
How do you envision that userspace would handle this failure from older kernels ?
Try again with a second system call passing RSEQ_SIZE_VER0 as argument ?
> return -EINVAL;
> if (!access_ok(rseq, rseq_len))
> return -EFAULT;
> +
> + /* Handle extensible struct rseq ABI. */
> + ret = get_user(tls_flags, &rseq->flags);
> + if (ret)
> + return ret;
> + if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
> + u32 user_size, kernel_size, active_size;
> +
> + /* Can probably be made nicer by using check_zeroed_user(). */
> + ret = get_user(user_size, &rseq->user_size);
> + if (ret)
> + return ret;
> + if (user_size != 0)
> + return -EINVAL;
> +
> + ret = get_user(active_size, &rseq->active_size);
> + if (ret)
> + return ret;
> + if (active_size != 0)
> + return -EINVAL;
> +
> + ret = get_user(active_size, &rseq->kernel_size);
I guess you mean kernel_size here.
> + if (ret)
> + return ret;
> + if (kernel_size != 0)
> + return -EINVAL;
> +
> + /* Calculate the useable size. */
> + active_size = min_t(u32, rseq_len, RSEQ_SIZE_LATEST);
Where is the rseq_len supposed to come from in userspace ? Should it be
that the code doing the registration uses sizeof(struct rseq), or offsetof(struct rseq, end),
or should it read the content of __rseq_abi.user_size ?
> + ret = put_user(active_size, &rseq->active_size);
> + if (ret)
> + return ret;
> +
> + /* Let other users know what userspace used to register. */
> + ret = put_user(rseq_len, &rseq->user_size);
> + if (ret)
> + return -EFAULT;
> +
> + /* Let other users know what size the kernel supports. */
I am not sure what those 3 __u32 fields (user_size, kernel_size, and active_size),
plus use of the rseq_len syscall parameter, accomplish which was not accomplished
by my __u16 user_size + kernel_size approach ? If anything, it seems to make support
of older kernels which do not support an extended rseq_len parameter more complex.
Thanks,
Mathieu
> + ret = put_user(RSEQ_SIZE_LATEST, &rseq->kernel_size);
> + if (ret)
> + return -EFAULT;
> +
> + current->rseq_size = active_size;
> + } else {
> + current->rseq_size = RSEQ_SIZE_VER0;
> + }
> +
> current->rseq = rseq;
> current->rseq_sig = sig;
> /*
> --
> 2.27.0
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Powered by blists - more mailing lists