Message-ID: <20230306191926.n5526srkze5fnqag@airbuntu>
Date: Mon, 6 Mar 2023 19:19:26 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: Steven Rostedt <rostedt@...dmis.org>,
John Stultz <jstultz@...gle.com>,
LKML <linux-kernel@...r.kernel.org>, Wei Wang <wvw@...gle.com>,
Midas Chien <midaschieh@...gle.com>,
Kees Cook <keescook@...omium.org>,
Anton Vorontsov <anton@...msg.org>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Tony Luck <tony.luck@...el.com>, kernel-team@...roid.com,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH] pstore: Revert pmsg_lock back to a normal mutex
On 03/04/23 03:21, Joel Fernandes wrote:
> On Fri, Mar 03, 2023 at 08:36:45PM +0000, Qais Yousef wrote:
> > On 03/03/23 14:38, Steven Rostedt wrote:
> > > On Fri, 3 Mar 2023 14:25:23 -0500
> > > Joel Fernandes <joel@...lfernandes.org> wrote:
> > >
> > > > On Fri, Mar 3, 2023 at 1:37 PM Steven Rostedt <rostedt@...dmis.org> wrote:
> > > > >
> > > > > On Fri, 3 Mar 2023 18:11:34 +0000
> > > > > Joel Fernandes <joel@...lfernandes.org> wrote:
> > > > >
> > > > > > In the normal mutex's adaptive spinning, there is no check for whether the
> > > > > > waiter has changed, AFAICS (ignoring ww mutex stuff for a second).
> > > > > >
> > > > > > I can see why one may want to do that waiter check, as spinning
> > > > > > indefinitely while the lock owner stays on the CPU for too long may result in
> > > > > > excessive power burn. But the normal mutex does not seem to do that.
> > > > > >
> > > > > > What makes the rtmutex spin logic different from the normal mutex in this
> > > > > > scenario, so that rtmutex wants to do that but normal ones don't?
> > > > >
> > > > > Well, the point of the patch is that I don't think they should be different
> > > > > ;-)
> > > >
> > > > But there's no "waiter change" check in mutex_spin_on_owner(), right?
> > > >
> > > > Then, should mutex_spin_on_owner() also add a call to
> > > > __mutex_waiter_is_first() ?
> > >
> > > Ah interesting, I missed the __mutex_waiter_is_first() in the mutex code,
> > > where it looks to do basically the same thing as rt_mutex (but slightly
> > > different). From looking at this, it appears that the mutex has FIFO fair
> > > logic, where the second waiter will sleep.
> > >
> > > Would be interesting to see why John sees such a huge difference between
> > > normal mutex and rtmutex if they are doing the same thing. One possibility is
> > > that the priority logic is causing the issue, in which case this will not
> > > improve anything.
> >
> > I think that is a good suspect. If the waiters are RT tasks, the root cause
> > might be a starvation issue due to bad priority setup, and moving to FIFO just
> > happens to hide it.
>
> I wonder if mutex should actually prioritize giving the lock to RT tasks
> instead of FIFO, since that's higher priority work. That sounds more
> 'fair'. But it's likely to make John's issue worse.
It is the right thing to do IMHO, but I guess the implications are just too
hard to predict to enforce it by default yet. Which is, I guess, why it's all
still protected by PREEMPT_RT.
(I'm not sure, but I assumed that logically PREEMPT_RT would convert all
mutexes to rt_mutexes by default.)
>
> > For same-priority RT tasks, we should behave as FIFO too AFAICS.
> >
> > If there is a mix of RT and CFS tasks, RT will always win of course.
> >
> > >
> > > I wonder, if we add spinning to the normal mutex for the other waiters,
> > > whether that would improve things or make them worse?
> >
> > I see a potential risk, depending on how long the worst case scenario for
> > this optimistic spinning can be.
> >
> > RT tasks can prevent all lower priority RT and CFS from running.
>
> Agree, I was kind of hoping need_resched() in mutex_spin_on_owner() would
> come to the rescue in such a scenario, but obviously not. Modifications to
> check_preempt_curr_rt() could obviously aid there but...
>
> > CFS tasks will lose some precious bandwidth from their sched_slice() as this
> > will be accounted to them as RUNNING time even though they were effectively
> > waiting.
>
> True, but maybe the CFS task is happy to lose some bandwidth and get back to
> the CPU quickly, rather than blocking and not getting any work done. ;-)
It depends on the worst case scenario of spinning. If we can ensure it's
bounded to something small, then yeah I don't see an issue :-)
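To make it concrete, something like this toy userspace sketch is what I mean
by "bounded" -- made-up names, not kernel code, just to illustrate stopping
the spin once the owner is preempted or after a hard cap:

    #include <stdatomic.h>
    #include <stdbool.h>

    struct toy_lock {
            _Atomic int locked;             /* 1 while the lock is held */
            _Atomic int owner_running;      /* roughly the owner's on_cpu flag */
    };

    /* Spin only while the owner keeps running, and never more than max_spins. */
    static bool toy_spin_bounded(struct toy_lock *lk, int max_spins)
    {
            for (int i = 0; i < max_spins; i++) {
                    if (!atomic_load_explicit(&lk->locked, memory_order_acquire))
                            return true;    /* released while we spun: go grab it */
                    if (!atomic_load_explicit(&lk->owner_running, memory_order_relaxed))
                            break;          /* owner preempted: spinning is wasted work */
            }
            return false;                   /* give up and block instead */
    }

With the cap (and the owner_running check) the damage to other runnable tasks
is limited, which is the "bounded to something small" part above.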
Cheers
--
Qais Yousef