Date: Sun, 14 Jan 2024 14:03:14 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Wyes Karny <wkarny@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, Qais Yousef <qyousef@...alina.io>, 
	Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org, 
	Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>, 
	Juri Lelli <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Daniel Bristot de Oliveira <bristot@...hat.com>, Valentin Schneider <vschneid@...hat.com>
Subject: Re: [GIT PULL] Scheduler changes for v6.8

On Sun, 14 Jan 2024 at 13:38, Wyes Karny <wkarny@...il.com> wrote:
>
> On Sun, Jan 14, 2024 at 12:18:06PM +0100, Vincent Guittot wrote:
> > Hi Wyes,
> >
> > On Sunday, 14 Jan 2024 at 14:42:40 (+0530), Wyes Karny wrote:
> > > On Wed, Jan 10, 2024 at 02:57:14PM -0800, Linus Torvalds wrote:
> > > > On Wed, 10 Jan 2024 at 14:41, Linus Torvalds
> > > > <torvalds@...ux-foundation.org> wrote:
> > > > >
> > > > > It's one of these two:
> > > > >
> > > > >   f12560779f9d sched/cpufreq: Rework iowait boost
> > > > >   9c0b4bb7f630 sched/cpufreq: Rework schedutil governor performance estimation
> > > > >
> > > > > one more boot to go, then I'll try to revert whichever causes my
> > > > > machine to perform horribly much worse.
> > > >
> > > > I guess it should come as no surprise that the result is
> > > >
> > > >    9c0b4bb7f6303c9c4e2e34984c46f5a86478f84d is the first bad commit
> > > >
> > > > but to revert cleanly I will have to revert all of
> > > >
> > > >       b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
> > > >       f12560779f9d ("sched/cpufreq: Rework iowait boost")
> > > >       9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
> > > >
> > > > This is on a 32-core (64-thread) AMD Ryzen Threadripper 3970X, fwiw.
> > > >
> > > > I'll keep that revert in my private test-tree for now (so that I have
> > > > a working machine again), but I'll move it to my main branch soon
> > > > unless somebody has a quick fix for this problem.
> > >
> > > Hi Linus,
> > >
> > > I'm able to reproduce this issue on my AMD Ryzen 5600G system, but
> > > only if I disable CPPC in the BIOS and boot with acpi-cpufreq + schedutil.
> > > (I believe CPPC is disabled in your case as well, since the log shows
> > > "_CPC object is not present".) With CPPC enabled in the BIOS, I don't
> > > see the issue on my system. On AMD, acpi-cpufreq also uses the _CPC
> > > object to determine the boost ratio. When CPPC is disabled in the BIOS,
> > > something goes wrong and max capacity becomes zero.
> > >
> > > Hi Vincent, Qais,
> > >

..

> >
> > There is something strange that I don't understand
> >
> > Could you trace on the return of sugov_get_util()
> > the value of sg_cpu->util ?
>
> Yeah, correct, something was wrong in the bpftrace readings; max_cap is
> not zero in the traces.
>
>              git-5511    [001] d.h1.   427.159763: get_next_freq.constprop.0: [DEBUG] : freq 1400000, util 1024, max 1024
>              git-5511    [001] d.h1.   427.163733: sugov_get_util: [DEBUG] : util 1024, sg_cpu->util 1024
>              git-5511    [001] d.h1.   427.163735: get_next_freq.constprop.0: [DEBUG] : freq 1400000, util 1024, max 1024
>              git-5511    [001] d.h1.   427.167706: sugov_get_util: [DEBUG] : util 1024, sg_cpu->util 1024
>              git-5511    [001] d.h1.   427.167708: get_next_freq.constprop.0: [DEBUG] : freq 1400000, util 1024, max 1024
>              git-5511    [001] d.h1.   427.171678: sugov_get_util: [DEBUG] : util 1024, sg_cpu->util 1024
>              git-5511    [001] d.h1.   427.171679: get_next_freq.constprop.0: [DEBUG] : freq 1400000, util 1024, max 1024
>              git-5511    [001] d.h1.   427.175653: sugov_get_util: [DEBUG] : util 1024, sg_cpu->util 1024
>              git-5511    [001] d.h1.   427.175655: get_next_freq.constprop.0: [DEBUG] : freq 1400000, util 1024, max 1024
>              git-5511    [001] d.s1.   427.175665: sugov_get_util: [DEBUG] : util 1024, sg_cpu->util 1024
>              git-5511    [001] d.s1.   427.175665: get_next_freq.constprop.0: [DEBUG] : freq 1400000, util 1024, max 1024
>
> Debug patch applied:
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 95c3c097083e..5c9b3e1de7a0 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -166,6 +166,7 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
>
>         freq = get_capacity_ref_freq(policy);
>         freq = map_util_freq(util, freq, max);
> +       trace_printk("[DEBUG] : freq %u, util %lu, max %lu\n", freq, util, max);
>
>         if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
>                 return sg_policy->next_freq;
> @@ -199,6 +200,7 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu, unsigned long boost)
>         util = max(util, boost);
>         sg_cpu->bw_min = min;
>         sg_cpu->util = sugov_effective_cpu_perf(sg_cpu->cpu, util, min, max);
> +       trace_printk("[DEBUG] : util %lu, sg_cpu->util %lu\n", util, sg_cpu->util);
>  }
>
>  /**
>
>
> So, I guess map_util_freq is going wrong somewhere.

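Thanks for the trace. It was really helpful, and I think I have found
the root cause.

The problem comes from get_capacity_ref_freq(), which returns the current
frequency when arch_scale_freq_invariant() is not enabled, combined with
the fact that we now apply map_util_perf() earlier in the path, and its
result is then capped by max capacity. With the utilization capped at max,
the requested frequency can never exceed policy->cur, which matches the
trace above (freq stays at 1400000 with util 1024 and max 1024).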

Could you try the patch below?

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index e420e2ee1a10..611c621543f4 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -133,7 +133,7 @@ unsigned long get_capacity_ref_freq(struct cpufreq_policy *policy)
        if (arch_scale_freq_invariant())
                return policy->cpuinfo.max_freq;

-       return policy->cur;
+       return policy->cur + (policy->cur >> 2);
 }

 /**



>
> Thanks,
> Wyes
> >
> > Thanks for your help
> > Vincent
> >
> > >
> > > Thanks,
> > > Wyes
> > >
