Message-ID: <CAEXW_YThrUgbbmje_1hRtWzNC5SozirDwhpccZiV=Trhe7HiHw@mail.gmail.com>
Date: Mon, 10 Feb 2025 09:07:27 -0500
From: Joel Fernandes <joel@...lfernandes.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Prakash Sangappa <prakash.sangappa@...cle.com>, Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>, Ankur Arora <ankur.a.arora@...cle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>, linux-mm@...ck.org, x86@...nel.org,
Andrew Morton <akpm@...ux-foundation.org>, luto@...nel.org, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, willy@...radead.org, mgorman@...e.de,
jon.grimm@....com, bharata@....com, raghavendra.kt@....com,
Boris Ostrovsky <boris.ostrovsky@...cle.com>, Konrad Wilk <konrad.wilk@...cle.com>, jgross@...e.com,
Andrew.Cooper3@...rix.com, Vineeth Pillai <vineethrp@...gle.com>,
Suleiman Souhlal <suleiman@...gle.com>, Ingo Molnar <mingo@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Clark Williams <clark.williams@...il.com>,
bigeasy@...utronix.de, daniel.wagner@...e.com,
Joseph Salisbury <joseph.salisbury@...cle.com>, broonie@...il.com
Subject: Re: [RFC][PATCH 1/2] sched: Extended scheduler time slice
On Thu, Feb 6, 2025 at 8:30 AM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Wed, 5 Feb 2025 22:07:12 -0500
> Joel Fernandes <joel@...lfernandes.org> wrote:
> > >
> > > RT tasks don't have a time slice. They are affected by events. An external
> > > interrupt coming in, or a timer going off that states something is
> > > happening. Perhaps we could use this for SCHED_RR or maybe even
> > > SCHED_DEADLINE, as those do have time slices.
> > >
> > > But if it does get used, it should only be used when the task being
> > > scheduled is the same SCHED_RR priority, or if SCHED_DEADLINE will not fail
> > > its guarantees.
> > >
> >
> > Right, it would apply still to RR/DL though...
>
> But it would have to guarantee that the RR it is delaying is of the same
> priority, and that delaying the DL is not going to cause something to miss
> its deadline.
See Peter's comment: "Then pick another number; RT too has a max
scheduling latency number (on some random hardware). If you stay below
that, all is fine."
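To make the constraint being discussed concrete, here is a rough sketch of the safety check (all struct fields, constants, and function names are hypothetical illustrations, not the kernel's actual API): grant the extension to the running task only if the waiting SCHED_RR task is at the same priority, or if the waiting SCHED_DEADLINE task has enough slack to absorb the delay.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sketch only -- names do not match the kernel's API. */
enum policy { POLICY_OTHER, POLICY_RR, POLICY_DL };

struct task {
	enum policy policy;
	int rr_prio;		/* higher value = higher priority */
	long dl_slack_ns;	/* time to deadline minus remaining runtime */
};

/*
 * May the current task's slice be extended by extension_ns without
 * violating the waiter's guarantees?  RR: only if the waiter is at the
 * same priority.  DL: only if the waiter's slack covers the delay.
 */
static bool may_extend_slice(const struct task *curr,
			     const struct task *waiter,
			     long extension_ns)
{
	switch (waiter->policy) {
	case POLICY_RR:
		return curr->policy == POLICY_RR &&
		       curr->rr_prio == waiter->rr_prio;
	case POLICY_DL:
		return waiter->dl_slack_ns >= extension_ns;
	default:
		return true;	/* SCHED_OTHER: no hard guarantee to break */
	}
}
```

This is only meant to show the shape of the condition; where such a check would actually live (pick_next_task(), the tick, or the rseq exit path) is exactly what is under discussion.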
> > 3. Overloading the purpose of LAZY: My understanding is that the
> > purpose of LAZY is to let the scheduler decide whether it wants to
> > preempt based on the preemption mode. It is not based on any hint,
> > just on the preemption mode. I guess you are overloading LAZY by
> > making the LAZY flag also extend the userspace time slice (versus,
> > say, making the time-slice extension hint its own thing...).
>
> I already replied about that. Note, LAZY was created in PREEMPT_RT for this
> very purpose (but in the kernel), and ported to vanilla for a slightly
> different purpose.
>
> Here's the history:
>
> PREEMPT_RT would convert spin_locks in the kernel to sleeping mutexes.
>
> This made RT tasks respond much faster to events.
>
> But non-RT (SCHED_OTHER) started suffering performance issues.
>
> When looking at the performance issues, we found that it was due to tasks
> holding these sleeping spin_locks and being preempted. That is, the
> preemption of holding spin_locks was causing more contention and slowing
> things down tremendously.
>
> To first handle this, adaptive mutexes were introduced. These would spin
> if the owner of the lock was still running, and would go to sleep if the
> owner went to sleep. This helped things quite a bit, but PREEMPT_RT still
> suffered a performance deficit compared to non-RT.
>
> This was because of the timer tick on SCHED_OTHER tasks that could
> preempt a task holding a spin lock.
>
> NEED_RESCHED_LAZY was introduced to remedy this. It would be set for
> SCHED_OTHER tasks and NEED_RESCHED for RT tasks. If the task was holding
> a sleeping spin lock, the NEED_RESCHED_LAZY would not preempt the running
> task, but NEED_RESCHED would. If the SCHED_OTHER task was not holding a
> sleeping spin_lock it would be preempted regardless.
>
> This improved the performance of SCHED_OTHER tasks in PREEMPT_RT to be as
> good as what was in vanilla.
>
> You see, LAZY was *created* for this purpose. Of letting the scheduler know
> that the running task is in a critical section and the timer tick should
> not preempt a SCHED_OTHER task.
> I just wanted to extend this to SCHED_OTHER in user space too.
Currently it does not "let anyone know" it is running in a critical
section, though. Various paths (update_curr(), wakeup) just do a
'lazy' resched until the timer tick has elapsed, or the task returns
to usermode/idle, at which point schedule() is called. And it does this
only for FAIR tasks. That can well happen even if the currently
running task is not in a critical section in the kernel at all. Sure,
it may benefit critical sections in the upstream kernel, but where is
that explicit? I still feel we should not overload this in-kernel
mechanism for userspace locking and complicate things.
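For reference, the flow I'm describing could be modeled roughly like
this (the flag names mirror the kernel's, but this is an illustrative
userspace sketch, not kernel code): fair tasks get only the lazy flag,
which forces a reschedule at the next tick or on return to user mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the lazy-resched flow; not kernel code. */
#define TIF_NEED_RESCHED	0x1u
#define TIF_NEED_RESCHED_LAZY	0x2u

struct task {
	unsigned int tif_flags;
	bool is_fair;		/* SCHED_OTHER (fair class)? */
};

/* Wakeup/update_curr path: fair tasks only get the lazy flag, so the
 * running task is not preempted immediately; RT gets the real flag. */
static void resched_curr(struct task *curr)
{
	if (curr->is_fair)
		curr->tif_flags |= TIF_NEED_RESCHED_LAZY;
	else
		curr->tif_flags |= TIF_NEED_RESCHED;
}

/* Timer tick: a pending lazy request is upgraded, forcing a real
 * reschedule at the next preemption point. */
static void tick(struct task *curr)
{
	if (curr->tif_flags & TIF_NEED_RESCHED_LAZY)
		curr->tif_flags |= TIF_NEED_RESCHED;
}

/* Return to user mode (or idle): either flag triggers schedule(). */
static bool should_schedule(const struct task *curr)
{
	return curr->tif_flags & (TIF_NEED_RESCHED | TIF_NEED_RESCHED_LAZY);
}
```

Note that nothing here consults whether the task actually holds a lock;
the deferral happens unconditionally for fair tasks, which is my point.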
> > Yes, I have worked on RT projects before -- you would know better
> > than anyone. :-D. But admittedly, I haven't got to work much with
> > PREEMPT_RT systems.
>
> Just using RT policy to improve performance is not an RT project. I'm
> talking about projects that if you miss a deadline things crash. Where the
> project works very hard to make sure everything works as intended.
No no no, I have done far more than just apply the RT policy, so
you do not know me that well ;-). I have worked on audio driver
latency, low-latency audio, latency issues in the vmalloc code,
preempt tracers, irq tracepoints, wake-up latency tracers, and
various scheduler overhead debugging; many of those issues dealt
with sub-millisecond latency. I have also dealt with CPU idle issues
in hardware causing real-time latency problems (see my past talks if
interested). I was partly a hardware engineer when I started my
career and have built circuits. I have degrees in Electronics and
Computer Engineering.
- Joel