[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240809073509.GK31338@noisy.programming.kicks-ass.net>
Date: Fri, 9 Aug 2024 09:35:09 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Chen Yu <yu.c.chen@...el.com>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org,
	kprateek.nayak@....com, wuyun.abel@...edance.com,
	youssefesmat@...omium.org, tglx@...utronix.de, efault@....de,
	Mike Galbraith <umgwanakikbuti@...il.com>
Subject: Re: [PATCH 21/24] sched/eevdf: Allow shorter slices to wakeup-preempt
On Thu, Aug 08, 2024 at 08:31:53PM +0800, Chen Yu wrote:
> On 2024-08-08 at 12:22:07 +0200, Peter Zijlstra wrote:
> > On Thu, Aug 08, 2024 at 06:15:50PM +0800, Chen Yu wrote:
> > > Hi Peter,
> > > 
> > > On 2024-07-27 at 12:27:53 +0200, Peter Zijlstra wrote:
> > > > Part of the reason to have shorter slices is to improve
> > > > responsiveness. Allow shorter slices to preempt longer slices on
> > > > wakeup.
> > > > 
> > > >     Task                  |   Runtime ms  | Switches | Avg delay ms    | Max delay ms    | Sum delay ms     |
> > > > 
> > > >   100ms massive_intr 500us cyclictest NO_PREEMPT_SHORT
> > > > 
> > > >   1 massive_intr:(5)      | 846018.956 ms |   779188 | avg:   0.273 ms | max:  58.337 ms | sum:212545.245 ms |
> > > >   2 massive_intr:(5)      | 853450.693 ms |   792269 | avg:   0.275 ms | max:  71.193 ms | sum:218263.588 ms |
> > > >   3 massive_intr:(5)      | 843888.920 ms |   771456 | avg:   0.277 ms | max:  92.405 ms | sum:213353.221 ms |
> > > >   1 chromium-browse:(8)   |  53015.889 ms |   131766 | avg:   0.463 ms | max:  36.341 ms | sum:60959.230  ms |
> > > >   2 chromium-browse:(8)   |  53864.088 ms |   136962 | avg:   0.480 ms | max:  27.091 ms | sum:65687.681  ms |
> > > >   3 chromium-browse:(9)   |  53637.904 ms |   132637 | avg:   0.481 ms | max:  24.756 ms | sum:63781.673  ms |
> > > >   1 cyclictest:(5)        |  12615.604 ms |   639689 | avg:   0.471 ms | max:  32.272 ms | sum:301351.094 ms |
> > > >   2 cyclictest:(5)        |  12511.583 ms |   642578 | avg:   0.448 ms | max:  44.243 ms | sum:287632.830 ms |
> > > >   3 cyclictest:(5)        |  12545.867 ms |   635953 | avg:   0.475 ms | max:  25.530 ms | sum:302374.658 ms |
> > > > 
> > > >   100ms massive_intr 500us cyclictest PREEMPT_SHORT
> > > > 
> > > >   1 massive_intr:(5)      | 839843.919 ms |   837384 | avg:   0.264 ms | max:  74.366 ms | sum:221476.885 ms |
> > > >   2 massive_intr:(5)      | 852449.913 ms |   845086 | avg:   0.252 ms | max:  68.162 ms | sum:212595.968 ms |
> > > >   3 massive_intr:(5)      | 839180.725 ms |   836883 | avg:   0.266 ms | max:  69.742 ms | sum:222812.038 ms |
> > > >   1 chromium-browse:(11)  |  54591.481 ms |   138388 | avg:   0.458 ms | max:  35.427 ms | sum:63401.508  ms |
> > > >   2 chromium-browse:(8)   |  52034.541 ms |   132276 | avg:   0.436 ms | max:  31.826 ms | sum:57732.958  ms |
> > > >   3 chromium-browse:(8)   |  55231.771 ms |   141892 | avg:   0.469 ms | max:  27.607 ms | sum:66538.697  ms |
> > > >   1 cyclictest:(5)        |  13156.391 ms |   667412 | avg:   0.373 ms | max:  38.247 ms | sum:249174.502 ms |
> > > >   2 cyclictest:(5)        |  12688.939 ms |   665144 | avg:   0.374 ms | max:  33.548 ms | sum:248509.392 ms |
> > > >   3 cyclictest:(5)        |  13475.623 ms |   669110 | avg:   0.370 ms | max:  37.819 ms | sum:247673.390 ms |
> > > > 
> > > > As per the numbers the, this makes cyclictest (short slice) it's
> > > > max-delay more consistent and consistency drops the sum-delay. The
> > > > trade-off is that the massive_intr (long slice) gets more context
> > > > switches and a slight increase in sum-delay.
> > > > 
> > > > [mike: numbers]
> > > > Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> > > > Tested-by: Mike Galbraith <umgwanakikbuti@...il.com>
> > > 
> > > Besides this short preemption, it seems that an important patch is missing from
> > > this patch set, that was originally from Prateek and you refined it to fix the
> > > current task's false negative eligibility:
> > > https://lore.kernel.org/lkml/20240424150721.GQ30852@noisy.programming.kicks-ass.net/
> > > 
> > > The RESPECT_SLICE is introduced to honor the current task's slice during wakeup preemption.
> > > Without it we got reported that over-preemption and performance downgrading are observed
> > > when running SPECjbb on servers.
> > 
> > So I *think* that running as SCHED_BATCH gets you exactly that
> > behaviour, no?
> 
> SCHED_BATCH should work as it avoids the wakeup preemption as much as possible.
> Except that RESPECT_SLICE considers the cgroup hierarchical to check if the current
> sched_entity has used up its slice, which seems to be less aggressive.
Note that update_deadline() will trigger a resched at the end up a slice
regardless -- this is driven from update_curr() and also invoked from
any preemption.
Powered by blists - more mailing lists
 
