Message-ID: <000801db604b$e0f6b580$a2e42080$@telus.net>
Date: Mon, 6 Jan 2025 07:01:34 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Peter Zijlstra'" <peterz@...radead.org>
Cc: <linux-kernel@...r.kernel.org>,
<vincent.guittot@...aro.org>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF
On 2025.01.06 03:58 Peter Zijlstra wrote:
>On Sun, Dec 29, 2024 at 02:51:43PM -0800, Doug Smythies wrote:
>> Hi Peter,
>>
>> I have been having trouble with turbostat reporting processor package power levels that can not possibly be true.
>> After eliminating the turbostat program itself as the source of the issue I bisected the kernel.
>> An edited summary (actual log attached):
>>
>> 82e9d0456e06 sched/fair: Avoid re-setting virtual deadline on 'migrations'
>> b10 bad fc1892becd56 sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE
>> b13 bad 54a58a787791 sched/fair: Implement DELAY_ZERO
>> skip 152e11f6df29 sched/fair: Implement delayed dequeue
>> skip e1459a50ba31 sched: Teach dequeue_task() about special task states
>> skip a1c446611e31 sched,freezer: Mark TASK_FROZEN special
>> skip 781773e3b680 sched/fair: Implement ENQUEUE_DELAYED
>> skip f12e148892ed sched/fair: Prepare pick_next_task() for delayed dequeue
>> skip 2e0199df252a sched/fair: Prepare exit/cleanup paths for delayed_dequeue
>> b12 good e28b5f8bda01 sched/fair: Assert {set_next,put_prev}_entity() are properly balanced
>> dfa0a574cbc4 sched/uclamg: Handle delayed dequeue
>> b11 good abc158c82ae5 sched: Prepare generic code for delayed dequeue
>> e8901061ca0c sched: Split DEQUEUE_SLEEP from deactivate_task()
>>
>> Where "bN" is just my assigned kernel name for each bisection step.
>>
>> In the linux-kernel email archives I found a thread that isolated these same commits.
>> It was from late November / early December:
>>
>> https://lore.kernel.org/all/20240727105030.226163742@infradead.org/T/#m9aeb4d897e029cf7546513bb09499c320457c174
>>
>> An example of the turbostat manifestation of the issue:
>>
>> doug@s19:~$ sudo ~/kernel/linux/tools/power/x86/turbostat/turbostat --quiet --Summary --show
>> Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,TSC_MHz --interval 1
>> [sudo] password for doug:
>> Busy% Bzy_MHz TSC_MHz IRQ PkgTmp PkgWatt
>> 99.76 4800 4104 12304 73 80.08
>> 99.76 4800 4104 12047 73 80.23
>> 99.76 4800 879 12157 73 11.40
>> 99.76 4800 26667 84214 72 557.23
>> 99.76 4800 4104 12036 72 79.39
>>
>> Where TSC_MHz was reported as 879, there was a big gap in time:
>> about 4.7 seconds instead of 1 (the TSC actually runs at 4104 MHz,
>> and 4104 / 879 = 4.67, so the interval was ~4.7x longer than assumed).
>> Where TSC_MHz was reported as 26667, there was not a big gap in time.
>>
>> It happens for roughly 5% of the samples, though that rate varies a lot.
>> It only happens when the workload is almost exactly 100%:
>> with more load it doesn't occur, and with less load it doesn't
>> occur either. Although, I did get this once:
>>
>> Busy% Bzy_MHz TSC_MHz IRQ PkgTmp PkgWatt
>> 91.46 4800 4104 11348 73 103.98
>> 91.46 4800 4104 11353 73 103.89
>> 91.50 4800 3903 11339 73 98.16
>> 91.43 4800 4271 12001 73 108.52
>> 91.45 4800 4148 11481 73 105.13
>> 91.46 4800 4104 11341 73 103.96
>> 91.46 4800 4104 11348 73 103.99
>>
>> So, it might just be much less probable and less severe.
>>
>> It happens over many different types of workload that I have tried.
>
> In private email you've communicated it happens due to
> sched_setaffinity() sometimes taking multiple seconds.
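[A hypothetical probe, not from the thread: taskset(1) calls sched_setaffinity(2) before exec'ing its command, so timing repeated taskset invocations from the shell would expose the kind of multi-second stall described above. The iteration count and 1000 ms threshold are arbitrary choices for illustration.]

```shell
#!/bin/sh
# Time repeated affinity changes; flag any that stall.
# taskset -c 0 performs sched_setaffinity() before exec'ing 'true'.
for i in $(seq 1 20); do
    start=$(date +%s%N)            # nanoseconds (GNU date)
    taskset -c 0 true              # sched_setaffinity() + trivial exec
    end=$(date +%s%N)
    ms=$(( (end - start) / 1000000 ))
    if [ "$ms" -gt 1000 ]; then
        echo "iteration $i: affinity change took ${ms} ms"
    fi
done
echo "done: last iteration took ${ms} ms"
```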
>
> I'm trying to reproduce by starting a bash 'while :; do :; done' spinner
> for each CPU, but so far am not able to reproduce.
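[A sketch of that reproduction attempt: one busy-looping bash spinner per online CPU. The fixed runtime and the cleanup at the end are additions so the sketch terminates on its own; a real reproduction run would use a much longer window.]

```shell
#!/bin/sh
# Start one bash busy-loop per online CPU, let it run, then clean up.
ncpus=$(nproc)
pids=""
for cpu in $(seq 1 "$ncpus"); do
    bash -c 'while :; do :; done' &
    pids="$pids $!"
done
echo "started $ncpus spinners"
sleep 2              # observation window; lengthen for real tests
kill $pids           # stop the spinners
wait 2>/dev/null
echo "stopped"
```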
I have also been trying to reproduce the issue without using turbostat.
No success.
The other thing to note is that my test computer is otherwise very
idle, with no GUI and few services running.
>
> What is the easiest 100% load you're seeing this with?
Lately, and specifically so I can tell others how to reproduce it,
I have been using:
yes > /dev/null &
On my Intel i5-10600K, with 6 cores and 2 threads per core (12 CPUs),
I run 12 of those workloads.
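[The load generator described above can be scripted as one 'yes > /dev/null' spinner per CPU; nproc yields 12 on the i5-10600K. The sleep/kill cleanup is an addition so the sketch terminates; in practice turbostat would be left running in another terminal during the load.]

```shell
#!/bin/sh
# One 'yes' spinner per CPU, ~100% load on every CPU while running.
n=$(nproc)
pids=""
for i in $(seq 1 "$n"); do
    yes > /dev/null &
    pids="$pids $!"
done
echo "running $n copies of 'yes'"
sleep 2              # run turbostat in another terminal during this window
kill $pids
wait 2>/dev/null
```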
... Doug