Message-ID: <94c23850-265c-5d71-f0a3-e02b5dbc050e@efficios.com>
Date: Thu, 28 Sep 2023 16:43:58 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org,
"Paul E . McKenney" <paulmck@...nel.org>,
Boqun Feng <boqun.feng@...il.com>,
"H . Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api@...r.kernel.org, Christian Brauner <brauner@...nel.org>,
Florian Weimer <fw@...eb.enyo.de>, David.Laight@...LAB.COM,
carlos@...hat.com, Peter Oskolkov <posk@...k.io>,
Alexander Mikhalitsyn <alexander@...alicyn.com>,
Chris Kennelly <ckennelly@...gle.com>,
Ingo Molnar <mingo@...hat.com>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
André Almeida <andrealmeid@...lia.com>,
libc-alpha@...rceware.org, Steven Rostedt <rostedt@...dmis.org>,
Jonathan Corbet <corbet@....net>,
Noah Goldstein <goldstein.w.n@...il.com>,
Daniel Colascione <dancol@...gle.com>, longman@...hat.com,
Florian Weimer <fweimer@...hat.com>
Subject: Re: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq
On 9/28/23 16:21, Thomas Gleixner wrote:
> On Mon, May 29 2023 at 15:14, Mathieu Desnoyers wrote:
>> +void __rseq_set_sched_state(struct task_struct *t, unsigned int state);
>> +
>> +static inline void rseq_set_sched_state(struct task_struct *t, unsigned int state)
>> +{
>> + if (t->rseq_sched_state)
>> + __rseq_set_sched_state(t, state);
>
> This is invoked on every context switch and writes over that state
> unconditionally even in the case that the state was already
> cleared. There are enough situations where tasks are scheduled out
> several times while being in the kernel.
Right, if this becomes more than a PoC, I'll make sure to keep track of
the current state within the task struct, and only update userspace on
an actual state transition.
>
>> /* rseq_preempt() requires preemption to be disabled. */
>> static inline void rseq_preempt(struct task_struct *t)
>> {
>> __set_bit(RSEQ_EVENT_PREEMPT_BIT, &t->rseq_event_mask);
>> rseq_set_notify_resume(t);
>> + rseq_set_sched_state(t, 0);
>
> This code is already stupid to begin with. __set_bit() is cheap, but
> rseq_set_notify_resume() is not as it has a conditional and a locked
> instruction
What alternative would you recommend for setting a per-thread state that
has the same effect as TIF_NOTIFY_RESUME? Indeed, all it really needs to
synchronize with is the thread owning the flags, but AFAIU, having this
flag be part of the TIF flags requires an atomic instruction to
synchronize updates against concurrent threads.
If we move this thread flag into a separate field of struct thread_info,
then we could turn this atomic set bit into a more lightweight store,
but then we'd need to check against an extra field on return to userspace.
And if we want to remove the conditional branch on the scheduler
fast-path, we could always load and test both the task struct's rseq
pointer and the thread_info "preempted" state on return to userspace.
The tradeoff there would be to add extra loads and conditional branches
on return to userspace to speed up the scheduler fast-path.
Is this what you have in mind, or am I missing your point?
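To make the tradeoff concrete, here is a userspace model of that second
option (field and function names are assumptions for the sake of the
example, not kernel APIs): the preempted indication lives in its own
thread_info field that only the owning context writes, so the scheduler
side is a plain store, and the extra load-and-branch moves to the
return-to-userspace path, gated on the task's rseq pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for struct thread_info. */
struct thread_info_sim {
	unsigned long flags;		/* TIF bits: need atomic updates */
	unsigned int rseq_preempted;	/* owner-only: plain store suffices */
};

/* Scheduler fast-path side: no lock-prefixed instruction. */
static void rseq_note_preempt(struct thread_info_sim *ti)
{
	ti->rseq_preempted = 1;
}

/* Return-to-userspace side: extra loads + conditional branch live here. */
static int exit_to_user_needs_rseq_fixup(const struct thread_info_sim *ti,
					 const void *rseq_ptr)
{
	return rseq_ptr != NULL && ti->rseq_preempted;
}
```

The cost shifts from every context switch to every kernel exit, which is
only a win if exits are cheaper to burden than the scheduler fast-path.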
> and now you add two more conditionals into the context
> switch path.
I'm open to suggestions on how to improve this if this goes beyond PoC
stage and we observe measurable benefits on the userspace side.
Thanks,
Mathieu
>
> Thanks,
>
> tglx
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com