[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231208214733.GA3448992@google.com>
Date: Fri, 8 Dec 2023 21:47:33 +0000
From: Joel Fernandes <joel@...lfernandes.org>
To: Daniel Bristot de Oliveira <bristot@...nel.org>,
Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org,
Luca Abeni <luca.abeni@...tannapisa.it>,
Tommaso Cucinotta <tommaso.cucinotta@...tannapisa.it>,
Thomas Gleixner <tglx@...utronix.de>,
Vineeth Pillai <vineeth@...byteword.org>,
Shuah Khan <skhan@...uxfoundation.org>,
Phil Auld <pauld@...hat.com>
Subject: Re: [PATCH v5 0/7] SCHED_DEADLINE server infrastructure
On Sat, Nov 04, 2023 at 11:59:17AM +0100, Daniel Bristot de Oliveira wrote:
> This is v5 of Peter's SCHED_DEADLINE server infrastructure
> implementation [1].
>
> SCHED_DEADLINE servers can help fixing starvation issues of low priority
> tasks (e.g., SCHED_OTHER) when higher priority tasks monopolize CPU
> cycles. Today we have RT Throttling; DEADLINE servers should be able to
> replace and improve that.
Hello!
Just wanted to provide some ChromeOS data on these patches. There is
great improvement when using DL-sever along with RT for foreground Chrome's
display, audio and main threads. Once they are in the background, we set them
back to CFS (except audio). I think these patches are ready to move forward
as the data looks good to me. I see Peter picked up some of them already
which is nice.
One of the key metrics for us is event latency. We have a test that measures
various latency metrics with typing happening on a Google docs in one window
and a 16-person Google meet call happening on the other. This is a very
complex test but gets us close to what the user experiences (as is typical -
meeting attendees in a Google meet call take notes in a Google doc). As a
result, getting stable numbers requires a lot of care which is why I used
P-value to measure the statistical significance of the results. The P-value
for some metrics show lower significance, so we can ignore those but I still
provided it in the table.
The test is run on a Chromebook with 4 cores (Intel(R) Celeron(R) N4100 CPU @
1.10GHz) and 16GB of RAM. No Hyperthreading.
All units are microseconds. The average is calculated as the average of 20
runs with and without "Chrome using RT + DL-server". The 5% every 1 second
default does not work for us, so I changed the DL server parameters to 5ms
every 30ms. This allows CFS to run more often.
This test runs for 6 hours. Total test time for both before and after is 12 hours:
---------------------------------------------------------------------------------------------------------
| MetricName | Average Before | Average After | Change % | P-value |
---------------------------------------------------------------------------------------------------------
| Ash.EventLatency.Core.TotalLatency | 90.19 | 78.22 | 13.27% | 0.03 |
---------------------------------------------------------------------------------------------------------
| Ash.EventLatency.KeyReleased.TotalLatency | 90733.76 | 78602.72 | 13.37% | 0.03 |
---------------------------------------------------------------------------------------------------------
| Ash.EventLatency.TotalLatency | 90.19 | 78.22 | 13.27% | 0.03 |
---------------------------------------------------------------------------------------------------------
| Docs.EventLatency.KeyPressed.TotalLatency | 68269.21 | 63310.99 | 7.26% | 0.00 |
---------------------------------------------------------------------------------------------------------
| Docs.EventLatency.MousePressed.TotalLatency | 192080.44 | 179264.31 | 6.67% | 0.26 |
---------------------------------------------------------------------------------------------------------
| Docs.EventLatency.TotalLatency | 68795.99 | 63860.04 | 7.17% | 0.00 |
---------------------------------------------------------------------------------------------------------
| EventLatency.GestureScrollUpdt.Wheel.TotalLat | 63420.88 | 59394.18 | 6.35% | 0.02 |
---------------------------------------------------------------------------------------------------------
| EventLatency.KeyPressed.TotalLatency | 68269.21 | 63310.99 | 7.26% | 0.00 |
---------------------------------------------------------------------------------------------------------
| EventLatency.MouseDragged.TotalLatency | 106393.09 | 104152.50 | 2.11% | 0.57 |
---------------------------------------------------------------------------------------------------------
| EventLatency.MouseMoved.TotalLatency | 129225.65 | 113268.48 | 12.35% | 0.01 |
---------------------------------------------------------------------------------------------------------
| EventLatency.MousePressed.TotalLatency | 192080.44 | 179264.31 | 6.67% | 0.26 |
---------------------------------------------------------------------------------------------------------
| EventLatency.MouseReleased.TotalLatency | 152366.33 | 140309.50 | 7.91% | 0.44 |
---------------------------------------------------------------------------------------------------------
| EventLatency.TotalLatency | 68795.99 | 63862.45 | 7.17% | 0.00 |
---------------------------------------------------------------------------------------------------------
| EventLatency.TotalLatency_ash-Chrome | 68795.99 | 63862.45 | 7.17% | 0.00 |
---------------------------------------------------------------------------------------------------------
I also did another test where I measure the CFS maximum latency (using perf
sched) while a YouTube video is playing, and the CFS max latency looks great
too. In fact, with the vanilla RT throttling, our CFS tasks are doing really
badly (perhaps because of depending on RT tasks due to locks or such). So we
definitely need the DL-server to use RT properly!
We are testing dlserver with 5ms/50ms and 5ms/100ms as well to see the
impact. But at the moment, 5ms/30ms is looking good.
Thanks for all of your work, here's to better Linux and better Chromebooks ;)
- Joel
Powered by blists - more mailing lists