[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230926205215.472650-1-dvyukov@google.com>
Date: Tue, 26 Sep 2023 22:52:15 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: mathieu.desnoyers@...icios.com
Cc: David.Laight@...LAB.COM, alexander@...alicyn.com,
andrealmeid@...lia.com, boqun.feng@...il.com, brauner@...nel.org,
carlos@...hat.com, ckennelly@...gle.com, corbet@....net,
dancol@...gle.com, dave@...olabs.net, dvhart@...radead.org,
fweimer@...hat.com, goldstein.w.n@...il.com, hpa@...or.com,
libc-alpha@...rceware.org, linux-api@...r.kernel.org,
linux-kernel@...r.kernel.org, longman@...hat.com, mingo@...hat.com,
paulmck@...nel.org, peterz@...radead.org, pjt@...gle.com,
posk@...k.io, rostedt@...dmis.org, tglx@...utronix.de
Subject: Re: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq
>> I don't see why we can't stick this directly into struct rseq because
>> it's all public anyway.
>
> The motivation for moving this to a different cache line is to handle
> the prior comment from Boqun, who is concerned that busy-waiting
> repeatedly loading a field from struct rseq will cause false-sharing and
> make other stores to that cache line slower, especially stores to
> rseq_cs to begin rseq critical sections, thus slightly increasing the
> overhead of rseq critical sections taken while mutexes are held.
>
> If we want to embed this field into struct rseq with its own cache line,
> then we need to add a lot of padding, which is inconvenient.
>
> That being said, perhaps this is premature optimization, what do you think ?
Hi Mathieu, Florian,
This is exciting!
I thought the motivation for moving rseq_sched_state out of struct rseq
is lifetime management problem. I assume when a thread locks a mutex,
it stores pointer to rseq_sched_state in the mutex state for other
threads to poll. So the waiting thread would do something along the following
lines:
rseq_sched_state* state = __atomic_load_n(mutex->sched_state, __ATOMIC_RELAXED);
if (state && !(state->state & RSEQ_SCHED_STATE_FLAG_ON_CPU))
futex_wait();
Now if the state is struct rseq, which is stored in TLS,
then the owning thread can unlock the mutex, exit and unmap TLS in between.
Consequently, load of state->state will cause a paging fault.
And we do want rseq in TLS to save 1 indirection.
If rseq_sched_state is separated from struct rseq, then it can be allocated
in type stable memory that is never unmapped.
What am I missing here?
However, if we can store this state in struct rseq, then an alternative
interface would for the kernel to do:
rseq->cpu_id = -1;
to denote that the thread is not running on any CPU.
I think it kinda makes sense, rseq->cpu_id is the thread's current CPU,
and -1 naturally means "not running at all". And we already store -1
right after init, so it shouldn't be a surprising value.
Powered by blists - more mailing lists