Message-ID: <007d01db652c$819f78c0$84de6a40$@telus.net>
Date: Sun, 12 Jan 2025 11:59:35 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
<vincent.guittot@...aro.org>,
"'Ingo Molnar'" <mingo@...nel.org>,
<wuyun.abel@...edance.com>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
Hi Peter,
While we have moved on from this branch of this email thread,
just for completeness I'll give the additional data from the overnight
test. There is also an observation below, which I will continue
in the next email.
On 2025.01.09 21:09 Doug Smythies wrote:
> On 2025.01.09 03:00 Peter Zijlstra wrote:
>
> ...
>
>> This made me have a very hard look at reweight_entity(), and
>> specifically the ->on_rq case, which is more prominent with
>> DELAY_DEQUEUE.
>>
>> And indeed, it is all sorts of broken. While the computation of the new
>> lag is correct, the computation for the new vruntime, using the new lag
>> is broken for it does not consider the logic set out in place_entity().
>>
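(A side note for readers following along: below is a minimal C sketch of the
relationship in question, assuming the usual EEVDF conventions that an
entity's lag is measured against the queue average, vlag = avg_vruntime -
vruntime, and that place_entity() pre-scales that lag by
(queue load + entity weight) / queue load before placing. The struct and
helper names are made up for illustration only; this is not the actual
kernel code.)

struct sketch_entity {
        long long vruntime;
        long long vlag;         /* avg_vruntime - vruntime */
        unsigned long weight;
};

static long long sketch_place_vruntime(long long avg_vruntime,
                                       unsigned long queue_load,
                                       const struct sketch_entity *se)
{
        long long lag = se->vlag;

        /*
         * Same idea as place_entity(): the queue average moves once this
         * entity's weight is added back, so grow the lag by
         * (queue_load + weight) / queue_load to preserve it afterwards.
         */
        if (queue_load) {
                lag *= (long long)(queue_load + se->weight);
                lag /= (long long)queue_load;
        }

        /*
         * Reweighting an on_rq entity should end up here as well: derive
         * the new vruntime from the new lag via the placement relation,
         * not by adjusting vruntime directly.
         */
        return avg_vruntime - lag;
}

The point being that a reweight of an on_rq entity needs to derive its new
vruntime through this same placement relation, rather than applying the new
lag directly.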
>> With the below patch, I now see things like:
>>
>> migration/12-55 [012] d..3. 309.006650: reweight_entity: (ffff8881e0e6f600-ffff88885f235f40-12)
>> { weight: 977582 avg_vruntime: 4860513347366 vruntime: 4860513347908 (-542) deadline: 4860516552475 } ->
>> { weight: 2 avg_vruntime: 4860528915984 vruntime: 4860793840706 (-264924722) deadline: 6427157349203 }
>> migration/14-62 [014] d..3. 309.006698: reweight_entity: (ffff8881e0e6cc00-ffff88885f3b5f40-15)
>> { weight: 2 avg_vruntime: 4874472992283 vruntime: 4939833828823 (-65360836540) deadline: 6316614641111 } ->
>> { weight: 967149 avg_vruntime: 4874217684324 vruntime: 4874217688559 (-4235) deadline: 4874220535650 }
>>
>> Which isn't perfect yet, but much closer.
>
> Agreed.
> I tested the patch. Attached is a repeat of a graph I had sent before, with a different y-axis scale and the old data deleted.
> It still compares to the "b12" kernel (the last good one in the kernel bisection).
> It was a 2 hour and 31 minute duration test, and the maximum CPU migration time was 24 milliseconds,
> versus 6 seconds without the patch.
>
> I left things running for many hours and will let it continue overnight.
> There seems to have been an issue at one spot in time:
>
> usec Time_Of_Day_Seconds CPU Busy% IRQ
> 488994 1736476550.732222 - 99.76 12889
> 488520 1736476550.732222 11 99.76 1012
> 960999 1736476552.694222 - 99.76 17922
> 960587 1736476552.694222 11 99.76 1493
> 914999 1736476554.610222 - 99.76 23579
> 914597 1736476554.610222 11 99.76 1962
> 809999 1736476556.421222 - 99.76 23134
> 809598 1736476556.421222 11 99.76 1917
> 770998 1736476558.193221 - 99.76 21757
> 770603 1736476558.193221 11 99.76 1811
> 726999 1736476559.921222 - 99.76 21294
> 726600 1736476559.921222 11 99.76 1772
> 686998 1736476561.609221 - 99.76 20801
> 686600 1736476561.609221 11 99.76 1731
> 650998 1736476563.261221 - 99.76 20280
> 650601 1736476563.261221 11 99.76 1688
> 610998 1736476564.873221 - 99.76 19857
> 610606 1736476564.873221 11 99.76 1653
The test was continued overnight, yielding this additional information:
868008 1736496040.956236 - 99.76 12668
867542 1736496040.956222 5 99.76 1046
5950010 1736496047.907233 - 99.76 22459
5949592 1736496047.907222 5 99.76 1871
5791008 1736496054.699232 - 99.76 83625
5790605 1736496054.699222 5 99.76 6957
1962999 1736502192.036227 - 99.76 12896
1962528 1736502192.036227 11 99.76 1030
434858 1736502193.472086 - 99.76 35824
434387 1736502193.472086 11 99.76 2965
In other words, 2 more continuous bursts, and one 5.9 second sample.
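(Something like the following minimal sketch can pull such rows out of a
turbostat log; it assumes the first column is turbostat's optional "usec"
column, whitespace-separated lines as above, and the 10000 usec cutoff is
only an example, not what my scripts actually use.)

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        char line[256];

        while (fgets(line, sizeof(line), stdin)) {
                /* First field is the usec column; strtol stops at the
                 * first non-digit, so header lines parse as 0 and fall
                 * below the threshold. */
                long usec = strtol(line, NULL, 10);

                if (usec >= 10000)      /* 10 ms, example threshold */
                        fputs(line, stdout);
        }
        return 0;
}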
Observation: there isn't any data in the tens-of-milliseconds range.
Based on the graph, which is basically the same test
done in an ever so slightly different way, there should be
a lot of such data.
Rather than re-attach the same graph, I'll present the
same data as a histogram:
First, the b12 kernel (the last good one in the kernel bisection):
Time (seconds), Occurrences
1.000000, 3282
1.001000, 1826
1.002000, 227
1.003000, 1852
1.004000, 1036
1.005000, 731
1.006000, 75
1.007000, 30
1.008000, 9
1.009000, 2
1.010000, 1
1.011000, 1
Total: 9072 : Total >= 10 mSec: 2 ( 0.02 percent)
Second, kernel 6.13-rc6 + this one patch:
Time (seconds), Occurrences
1.000000, 1274
1.001000, 1474
1.002000, 512
1.003000, 3201
1.004000, 849
1.005000, 593
1.006000, 246
1.007000, 104
1.008000, 36
1.009000, 15
1.010000, 19
1.011000, 16
1.012000, 11
1.013000, 27
1.014000, 26
1.015000, 35
1.016000, 105
1.017000, 85
1.018000, 135
1.019000, 283
1.020000, 17
1.021000, 4
1.022000, 3
1.023000, 1
Total: 9071 : Total >= 10 mSec: 767 ( 8.46 percent)
Where, for example, this line:
1.005000, 593
means that there were 593 occurrences of turbostat interval times
between 1.005 and 1.005999 seconds.
So, I would expect to see that reflected in the overnight test, but I don't.
It would appear that the slightly different way of doing the test
affects the probability of occurrence significantly.
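(For reference, a minimal sketch of how such a histogram can be built, in
1 millisecond bins; it assumes the turbostat interval times have already
been parsed into an array of seconds, and the function name is made up for
illustration.)

#include <stdio.h>

#define NBINS 1000      /* 1.000 .. 1.999 seconds, 1 ms per bin */

static void histogram(const double *interval, int n)
{
        int bins[NBINS] = { 0 };
        int ge_10ms = 0;        /* intervals >= 10 ms over nominal */
        int i, b;

        for (i = 0; i < n; i++) {
                b = (int)((interval[i] - 1.0) * 1000.0);
                if (b >= 0 && b < NBINS)
                        bins[b]++;
                if (interval[i] >= 1.010)
                        ge_10ms++;
        }

        for (b = 0; b < NBINS; b++)
                if (bins[b])
                        printf("%.6f, %d\n", 1.0 + b / 1000.0, bins[b]);

        printf("Total: %d : Total >= 10 mSec: %d ( %.2f percent)\n",
               n, ge_10ms, n ? 100.0 * ge_10ms / n : 0.0);
}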
I'll continue in a reply to your patch 2 email from Friday (Jan 10th).
>
> I had one of these the other day also, but they were all 6 seconds.
> It's like a burst of problematic data. I have the data somewhere,
> and can try to find it tomorrow.
>>
>> Fixes: eab03c23c2a1 ("sched/eevdf: Fix vruntime adjustment on reweight")
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>
> ...