[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250502090529.GU4198@noisy.programming.kicks-ass.net>
Date: Fri, 2 May 2025 11:05:29 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Prakash Sangappa <prakash.sangappa@...cle.com>
Cc: linux-kernel@...r.kernel.org, rostedt@...dmis.org,
mathieu.desnoyers@...icios.com, tglx@...utronix.de,
bigeasy@...utronix.de, kprateek.nayak@....com
Subject: Re: [PATCH V3 1/4] Sched: Scheduler time slice extension
On Fri, May 02, 2025 at 01:59:52AM +0000, Prakash Sangappa wrote:
> Add support for a thread to request extending its execution time slice on
> the cpu. The extra cpu time granted would help in allowing the thread to
> complete executing the critical section and drop any locks without getting
> preempted. The thread would request this cpu time extension, by setting a
> bit in the restartable sequences(rseq) structure registered with the kernel.
>
> Kernel will grant a 50us extension on the cpu, when it sees the bit set.
> With the help of a timer, kernel force preempts the thread if it is still
> running on the cpu when the 50us timer expires. The thread should yield
> the cpu by making a system call after completing the critical section.
>
> Suggested-by: Peter Ziljstra <peterz@...radead.org>
> Signed-off-by: Prakash Sangappa <prakash.sangappa@...cle.com>
> ---
> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
> index c233aae5eac9..900cb75f6a88 100644
> --- a/include/uapi/linux/rseq.h
> +++ b/include/uapi/linux/rseq.h
> @@ -26,6 +26,7 @@ enum rseq_cs_flags_bit {
> RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = 0,
> RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = 1,
> RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = 2,
> + RSEQ_CS_FLAG_DELAY_RESCHED_BIT = 3,
> };
>
> enum rseq_cs_flags {
> @@ -35,6 +36,8 @@ enum rseq_cs_flags {
> (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
> RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE =
> (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
> + RSEQ_CS_FLAG_DELAY_RESCHED =
> + (1U << RSEQ_CS_FLAG_DELAY_RESCHED_BIT),
> };
>
> /*
> @@ -128,6 +131,10 @@ struct rseq {
> * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
> * Inhibit instruction sequence block restart on migration for
> * this thread.
> + * - RSEQ_CS_DELAY_RESCHED
> + * Request by user task to try delaying preemption. With
> + * use of a timer, extra cpu time upto 50us is granted for this
> + * thread before being rescheduled.
> */
> __u32 flags;
So while it works for testing, this really is a rather crap interface
for real because userspace cannot up front tell if its going to work or
not.
> +void rseq_delay_resched_fini(void)
> +{
> +#ifdef CONFIG_SCHED_HRTICK
> + extern void hrtick_local_start(u64 delay);
> + struct task_struct *t = current;
> + /*
> + * IRQs off, guaranteed to return to userspace, start timer on this CPU
> + * to limit the resched-overdraft.
> + *
> + * If your critical section is longer than 50 us you get to keep the
> + * pieces.
> + */
> + if (t->sched_time_delay)
> + hrtick_local_start(50 * NSEC_PER_USEC);
> +#endif
> +}
Should not the interface at least reflect this SCHED_HRTICK status? One,
slightly hacky way of doing this might to be invert the bit. Have the
system write a 1 when the feature is present, and have userspace flip it
to 0 to activate.
A better way might be to add a second bit.
Also, didn't we all agree 50us was overly optimistic and this number
should be lower?
> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
> index cd38f4e9899d..1b2b64fe0fb1 100644
> --- a/kernel/sched/syscalls.c
> +++ b/kernel/sched/syscalls.c
> @@ -1378,6 +1378,11 @@ static void do_sched_yield(void)
> */
> SYSCALL_DEFINE0(sched_yield)
> {
> + if (IS_ENABLED(CONFIG_RSEQ) && current->sched_time_delay) {
> + schedule();
> + return 0;
> + }
> +
> do_sched_yield();
> return 0;
> }
Multiple people, very much including Linus, have already said this
'cute' hack isn't going to fly. Why is it still here?
Powered by blists - more mailing lists