Date: Thu, 23 May 2024 10:06:04 +0100
From: Luis Machado <luis.machado@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
 dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
 mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
 linux-kernel@...r.kernel.org, kprateek.nayak@....com,
 wuyun.abel@...edance.com, tglx@...utronix.de, efault@....de, nd
 <nd@....com>, John Stultz <jstultz@...gle.com>, Hongyan.Xia2@....com
Subject: Re: [RFC][PATCH 08/10] sched/fair: Implement delayed dequeue

Peter,

On 5/23/24 09:45, Peter Zijlstra wrote:
> On Mon, Apr 29, 2024 at 03:33:04PM +0100, Luis Machado wrote:
> 
>> (2) m6.6-eevdf-complete: m6.6-stock plus this series.
>> (3) m6.6-eevdf-complete-no-delay-dequeue: (2) + NO_DELAY_DEQUEUE
> 
>> +------------+------------------------------------------------------+-----------+
>> |  cluster   |                         tag                          | perc_diff |
>> +------------+------------------------------------------------------+-----------+
>> |    CPU     |                   m6.6-stock                         |   0.0%    |
>> |  CPU-Big   |                   m6.6-stock                         |   0.0%    |
>> | CPU-Little |                   m6.6-stock                         |   0.0%    |
>> |  CPU-Mid   |                   m6.6-stock                         |   0.0%    |
>> |    GPU     |                   m6.6-stock                         |   0.0%    |
>> |   Total    |                   m6.6-stock                         |   0.0%    |
> 
>> |    CPU     |        m6.6-eevdf-complete-no-delay-dequeue          |  117.77%  |
>> |  CPU-Big   |        m6.6-eevdf-complete-no-delay-dequeue          |  113.79%  |
>> | CPU-Little |        m6.6-eevdf-complete-no-delay-dequeue          |  97.47%   |
>> |  CPU-Mid   |        m6.6-eevdf-complete-no-delay-dequeue          |  189.0%   |
>> |    GPU     |        m6.6-eevdf-complete-no-delay-dequeue          |  -6.74%   |
>> |   Total    |        m6.6-eevdf-complete-no-delay-dequeue          |  103.84%  |
> 
> This one is still flummoxing me. I've gone over the patch a few times on
> different days and I'm not seeing it. Without DELAY_DEQUEUE it should
> behave as before.
> 
> Let me try and split this patch up into smaller parts such that you can
> try and bisect this.
> 

Same situation on my end. I've been chasing this for some time and I don't fully
understand why things go off the rails energy-wise as soon as DELAY_DEQUEUE is
enabled, now that the load_avg accounting red herring is gone.

I do have one additional piece of information though. Hopefully it will be useful.

If I boot the kernel with NO_DELAY_DEQUEUE (the feature defaulting to false), things
work fine. If I then switch to DELAY_DEQUEUE at runtime, things start using a lot more
power.

The interesting bit is that if I switch back to NO_DELAY_DEQUEUE at runtime, things
don't go back to normal. Instead, energy use stays just as high as with DELAY_DEQUEUE on.
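
For anyone wanting to reproduce the runtime switching described above, here is a
minimal sketch of a toggle helper. It assumes the scheduler feature flags are exposed
through debugfs at /sys/kernel/debug/sched/features (debugfs mounted, root access);
that path and the helper names are my assumptions for illustration and are not part of
the series.

```python
from pathlib import Path

# Assumed debugfs location of the scheduler feature flags; adjust if your
# kernel exposes them elsewhere.
SCHED_FEATURES = Path("/sys/kernel/debug/sched/features")


def set_sched_feature(name: str, enabled: bool, path: Path = SCHED_FEATURES) -> str:
    """Write NAME (enable) or NO_NAME (disable) to the features file.

    Returns the token that was written.
    """
    token = name if enabled else f"NO_{name}"
    path.write_text(token)
    return token


def feature_enabled(name: str, path: Path = SCHED_FEATURES) -> bool:
    """Report whether NAME is currently listed as enabled in the features file."""
    tokens = path.read_text().split()
    if name in tokens:
        return True
    if f"NO_{name}" in tokens:
        return False
    raise KeyError(f"{name} not listed in {path}")
```

The sequence in question would then be: boot with the feature off, call
set_sched_feature("DELAY_DEQUEUE", True), measure, call
set_sched_feature("DELAY_DEQUEUE", False), and measure again.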

I wonder if we're leaving some unbalanced state somewhere while DELAY_DEQUEUE is on,
something that is signalling we have more load/utilization than we actually do.

The PELT signals look reasonable from what I can see. We don't seem to be boosting
frequencies, but we're running things mostly on big cores with DELAY_DEQUEUE on.

I'll keep investigating this. Please let me know if you need some additional data or
testing and I can get that going.
