[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <62825712-36bc-483c-9bca-db3d9233b0d2@efficios.com>
Date: Thu, 23 May 2024 16:20:16 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: André Almeida <andrealmeid@...lia.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
"Paul E . McKenney" <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
"H . Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
Florian Weimer <fw@...eb.enyo.de>, David.Laight@...LAB.COM,
carlos@...hat.com, Peter Oskolkov <posk@...k.io>,
Alexander Mikhalitsyn <alexander@...alicyn.com>,
Chris Kennelly <ckennelly@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Darren Hart <dvhart@...radead.org>, Davidlohr Bueso <dave@...olabs.net>,
libc-alpha@...rceware.org, Steven Rostedt <rostedt@...dmis.org>,
Jonathan Corbet <corbet@....net>, Noah Goldstein <goldstein.w.n@...il.com>,
Daniel Colascione <dancol@...gle.com>, longman@...hat.com,
kernel-dev@...lia.com
Subject: Re: [PATCH v2 1/1] futex: Add FUTEX_SPIN operation
On 2024-05-23 16:07, André Almeida wrote:
> Add a new mode for futex wait, the futex spin.
>
> Given the FUTEX2_SPIN flag, parse the futex value as the TID of the lock
> owner. Then, before going to the normal wait path, spins while the lock
> owner is running in a different CPU, to avoid the whole context switch
> operation and to quickly return to userspace. If the lock owner is not
> running, just sleep as the normal futex wait path.
>
> The user value is masked with FUTEX_TID_MASK, to allow some bits for
> future use.
>
> The check for the owner to be running or not is important to avoid
> spinning for something that won't be released quickly. Userspace is
> responsible on providing the proper TID, the kernel does a basic check.
>
> Signed-off-by: André Almeida <andrealmeid@...lia.com>
> ---
[...]
> +
> +static int futex_spin(struct futex_hash_bucket *hb, struct futex_q *q,
> + struct hrtimer_sleeper *timeout, u32 uval)
> +{
> + struct task_struct *p;
> + pid_t pid = uval & FUTEX_TID_MASK;
> +
> + p = find_get_task_by_vpid(pid);
> +
> + /* no task found, maybe it already exited */
> + if (!p) {
> + futex_q_unlock(hb);
> + return -EAGAIN;
> + }
> +
> + /* can't spin in a kernel task */
> + if (unlikely(p->flags & PF_KTHREAD)) {
> + put_task_struct(p);
> + futex_q_unlock(hb);
> + return -EPERM;
> + }
> +
> + futex_queue(q, hb);
> +
> + if (timeout)
> + hrtimer_sleeper_start_expires(timeout, HRTIMER_MODE_ABS);
> +
> + while (1) {
Infinite loops in other kernel/futex/ files appear to use "for (;;) {"
instead.
> + if (likely(!plist_node_empty(&q->list))) {
> + if (timeout && !timeout->task)
> + goto exit;
> +
> + if (task_on_cpu(p)) {
> + /* spin */
You may want to add a "cpu_relax();" here to lessen the
power consumption of this busy-loop.
> + continue;
> + } else {
> + /* task is not running, sleep */
> + break;
> + }
> + } else {
> + goto exit;
> + }
> + }
> +
> + /* spinning didn't work, go to the normal path */
> + set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
I wonder if flipping the order between:
set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
futex_queue(q, hb);
as done in futex_wait_queue() and what is done here matters ?
Does it introduce a race where a wakeup could be missed ?
If it's an issue, then setting the current state could be done
before invoking futex_queue(), and whenever the spin exits,
set it back to TASK_RUNNING.
> +
> + if (likely(!plist_node_empty(&q->list))) {
> + if (!timeout || timeout->task)
> + schedule();
> + }
> +
> + __set_current_state(TASK_RUNNING);
> +
> +exit:
> + put_task_struct(p);
> + return 0;
> +}
> +
> /**
> * futex_unqueue_multiple - Remove various futexes from their hash bucket
> * @v: The list of futexes to unqueue
> @@ -665,8 +732,15 @@ int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
> if (ret)
> return ret;
>
> - /* futex_queue and wait for wakeup, timeout, or a signal. */
> - futex_wait_queue(hb, &q, to);
> + if (flags & FLAGS_SPIN) {
> + ret = futex_spin(hb, &q, to, val);
The empty line below could be removed.
Thanks,
Mathieu
> +
> + if (ret)
> + return ret;
> + } else {
> + /* futex_queue and wait for wakeup, timeout, or a signal. */
> + futex_wait_queue(hb, &q, to);
> + }
>
> /* If we were woken (and unqueued), we succeeded, whatever. */
> if (!futex_unqueue(&q))
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists