Message-ID: <20230511135303.GE83892@hirez.programming.kicks-ass.net>
Date: Thu, 11 May 2023 15:53:03 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Crystal Wood <swood@...hat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org, Ben Segall <bsegall@...gle.com>,
Boqun Feng <boqun.feng@...il.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...hat.com>,
John Stultz <jstultz@...gle.com>,
Juri Lelli <juri.lelli@...hat.com>,
Mel Gorman <mgorman@...e.de>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH v2 1/4] sched/core: Provide sched_rtmutex() and expose
sched work helpers
On Tue, May 09, 2023 at 05:14:38PM -0500, Crystal Wood wrote:
> On Wed, 2023-05-03 at 15:20 +0200, Peter Zijlstra wrote:
> > Urgh, so I really don't like this.
> >
> > The end result is something like:
> >
> >   rt_mutex_lock()
> >     sched_submit_work();
> >       // a nested rt_mutex_lock() here will not clobber
> >       // ->pi_blocked_on because it's not set yet.
> >
> >     task_blocks_on_rt_mutex();
> >       tsk->pi_blocked_on = waiter;
> >       rt_mutex_enqueue(lock, waiter); <-- the real problem
> >
> >     rt_mutex_slowlock_block();
> >       schedule_rtmutex();
> >
> >     sched_resume_work();
> >
> > And all of this is not just because of tsk->pi_blocked_on, but mostly
> > because of task_blocks_on_rt_mutex() enqueueing the waiter. The whole
> > enqueue thing is what makes the 'simple' solution of saving/restoring
> > tsk->pi_blocked_on not work.
> >
> > Basically the pi_blocked_on corruption is a side effect, not the
> > fundamental issue. One task having two waiters registered is the bigger
> > issue.
>
> Where do you see pi_blocked_on being saved/restored?
I'm not; I'm saying that *IF* ->pi_blocked_on corruption were the real
problem, that would be a sufficient solution.
But ->pi_blocked_on corruption is just a wee side effect of the real
problem, which is that the waiter is already enqueued and we've already
done PI and can't very well back out of all that in a hurry.
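
To make that concrete (the wrapper function name is made up for the
example; sched_submit_work() and ->pi_blocked_on are the real things),
the naive save/restore would look something like:

	/*
	 * Hypothetical sketch of the 'simple' save/restore idea, and
	 * why it doesn't work: clearing ->pi_blocked_on hides the
	 * corruption, but the waiter itself stays enqueued.
	 */
	static void rt_mutex_submit_work(struct task_struct *tsk)
	{
		struct rt_mutex_waiter *saved = tsk->pi_blocked_on;

		tsk->pi_blocked_on = NULL;	/* papers over the corruption */

		/*
		 * If this blocks on another lock, @tsk now has two
		 * waiters enqueued: @saved is still on the outer
		 * lock's wait tree and its PI boost has already
		 * propagated. There is no cheap way to back that
		 * out from here.
		 */
		sched_submit_work(tsk);

		tsk->pi_blocked_on = saved;
	}
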
> The whole point of
> this patchset is to deal with sched_submit_work() before anything has
> been done on the "outer" lock acquisition (not just pi_blocked_on, but
> also enqueuing) other than failing the fast path.
It's also terribly fragile, sprinkling stuff all over that shouldn't be
sprinkled.
And it's sprinkled far wider than it needs to be -- per the below
argument it really should only be applied to rtlock, not to rt_mutex or
ww_rt_mutex or any of the others that normally block and shouldn't be
used anyway.
> > Now, sched_submit_work() could also use (regular) mutex -- after all
> > it's a fully preemptible context. And then we're subject to the 'same'
> > problem but with tsk->blocked_on (DEBUG_MUTEXES=y).
>
> It's fully preemptible but it still shouldn't be doing things that would
> block on non-RT. That'd already be broken for a number of reasons (task
> state corruption, infinite recursion if current->plug isn't cleared
> before doing whatever causes another standard schedule(), etc).
task->state is fairly immune to corruption normally -- the typical
case is that the nested block resolves and resets it to TASK_RUNNING,
at which point the outer loop 'fails' to schedule() and re-tries. All
that is mostly harmless.
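
That is, the canonical wait loop already absorbs a spurious reset
(plain sketch; 'condition' stands in for whatever we're waiting on):

	for (;;) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		if (condition)
			break;
		/*
		 * If a nested block already woke us and reset the
		 * state to TASK_RUNNING, this schedule() more or less
		 * falls straight through and the loop simply
		 * re-tries.
		 */
		schedule();
	}
	__set_current_state(TASK_RUNNING);
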
But yes, all that code *SHOULD NOT* block, but nothing really
enforces that.
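
If we wanted to enforce it, something like the below could be bolted
on (entirely hypothetical; no such field exists today):

	static void sched_submit_work(struct task_struct *tsk)
	{
		/* assert the submit-work path never recurses */
		WARN_ON_ONCE(tsk->in_sched_submit_work);
		tsk->in_sched_submit_work = 1;

		/* ... flush block plug, wq/io-worker sleep hooks ... */

		tsk->in_sched_submit_work = 0;
	}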