Message-ID: <20250212121113.3nJ-kf-6@linutronix.de>
Date: Wed, 12 Feb 2025 13:11:13 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Joel Fernandes <joel@...lfernandes.org>,
Prakash Sangappa <prakash.sangappa@...cle.com>,
Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>,
Ankur Arora <ankur.a.arora@...cle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>, linux-mm@...ck.org,
x86@...nel.org, Andrew Morton <akpm@...ux-foundation.org>,
luto@...nel.org, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
willy@...radead.org, mgorman@...e.de, jon.grimm@....com,
bharata@....com, raghavendra.kt@....com,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Konrad Wilk <konrad.wilk@...cle.com>, jgross@...e.com,
Andrew.Cooper3@...rix.com, Vineeth Pillai <vineethrp@...gle.com>,
Suleiman Souhlal <suleiman@...gle.com>,
Ingo Molnar <mingo@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Clark Williams <clark.williams@...il.com>, daniel.wagner@...e.com,
Joseph Salisbury <joseph.salisbury@...cle.com>, broonie@...il.com
Subject: Re: [RFC][PATCH 1/2] sched: Extended scheduler time slice
On 2025-02-11 10:28:01 [-0500], Steven Rostedt wrote:
> On Tue, 11 Feb 2025 09:21:38 +0100
> Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
>
> > We don't follow this behaviour exactly today.
> >
> > Adding this behaviour back vs the behaviour we have now, doesn't seem to
> > improve anything at visible levels. We don't have a counter but we can
> > look at the RCU nesting counter which should be zero once locks have
> > been dropped. So this can be used for testing.
> >
> > But as I said: using "run to completion" and preempting on the return
> > to userland, rather than once the lazy flag is seen and all locks have
> > been released, appears to be better.
> >
> > It is (now) possible that you run for a long time and get preempted
> > while holding a spinlock_t. It is however more likely that you release
> > all locks and get preempted while returning to userland.
>
> IIUC, today, LAZY causes all SCHED_OTHER tasks to act more like
> PREEMPT_NONE. Is that correct?
Well. The first sched-tick sets the LAZY bit, the second sched-tick
forces a resched.
On PREEMPT_NONE the sched-tick would set NEED_RESCHED, but nothing
would force a resched until the task decides to call schedule() on its
own. So it is slightly different for kernel threads.
Userland is another matter: there we get a resched on the return to
userland after the sched-tick, so LAZY or NONE does not matter.
> Now that PREEMPT_RT is not one of the preemption selections, when you
> select PREEMPT_RT, you can pick between CONFIG_PREEMPT and
> CONFIG_PREEMPT_LAZY. Where CONFIG_PREEMPT will preempt the kernel at the
> scheduler tick if preemption is enabled, and CONFIG_PREEMPT_LAZY will
> not preempt the kernel on a scheduler tick and waits for exit to user space.
This is not specific to RT but FULL vs LAZY. But yes. However the second
sched-tick will force a preemption point even without the
exit to userland.
> Sebastian,
>
> It appears you only tested the CONFIG_PREEMPT_LAZY selection. Have you
> tested the difference of how CONFIG_PREEMPT behaves between PREEMPT_RT and
> no PREEMPT_RT? I think that will show a difference like we had in the past.
Not that I remember testing PREEMPT vs PREEMPT_RT. I remember people
complained about high networking load on RT, which became visible due to
threaded interrupts (as in top), while for non-RT it was more or less
hidden and not clearly visible due to the selected accounting. The
network performance was mostly the same as far as I remember (that is,
gbit).
> I can see people picking both PREEMPT_RT and CONFIG_PREEMPT (Full), but
> then wondering why their non RT tasks are suffering from a performance
> penalty from that.
We might want to opt in to lazy by default on RT. That was the case in
the RT queue until it was replaced with PREEMPT_AUTO.
But then why not use LAZY in favour of PREEMPT? Mike had numbers
https://lore.kernel.org/all/9df22ebbc2e6d426099bf380477a0ed885068896.camel@gmx.de/
where LAZY had mostly the VOLUNTARY performance with fewer context
switches than PREEMPT. Which also means no need for cond_resched() and
friends.
> -- Steve
Sebastian