Date:   Tue, 26 Jan 2021 10:09:27 +0100
From:   Giovanni Gherdovich <ggherdovich@...e.cz>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Jon Grimm <Jon.Grimm@....com>,
        Nathan Fontenot <Nathan.Fontenot@....com>,
        Yazen Ghannam <Yazen.Ghannam@....com>,
        Thomas Lendacky <Thomas.Lendacky@....com>,
        Suthikulpanit Suravee <Suravee.Suthikulpanit@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pu Wen <puwen@...on.cn>, Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Michael Larabel <Michael@...ronix.com>, x86@...nel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-acpi@...r.kernel.org
Subject: Re: [PATCH v2 1/1] x86,sched: On AMD EPYC set freq_max = max_boost
 in schedutil invariant formula

On Mon, 2021-01-25 at 11:06 +0100, Peter Zijlstra wrote:
> On Fri, Jan 22, 2021 at 09:40:38PM +0100, Giovanni Gherdovich wrote:
> > 1. PROBLEM DESCRIPTION (over-utilization and schedutil)
> > 
> > The problem happens on CPU-bound workloads spanning a large number of cores.
> > In this case schedutil won't select the maximum P-State. Actually, it's
> > likely that it will select the minimum one.
> > 
> > A CPU-bound workload puts the machine in a state generally called
> > "over-utilization": an increase in CPU speed doesn't result in an increase of
> > capacity. The fraction of time tasks spend on CPU becomes constant regardless
> > of clock frequency (the tasks eat whatever we throw at them), and the PELT
> > invariant util goes up and down with the frequency (i.e. it's not invariant
> > anymore).
> >                                       v5.10          v5.11-rc4
> >                                       ~~~~~~~~~~~~~~~~~~~~~~~~
> > CPU activity (mpstat)                 80-90%         80-90%
> > schedutil requests (tracepoint)       always P0      mostly P2
> > CPU frequency (HW feedback)           ~2.2 GHz       ~1.5 GHz
> > PELT root rq util (tracepoint)        ~825           ~450
> > 
> > mpstat shows that the workload is CPU-bound and usage doesn't change with
> 
> So I'm having trouble with calling a 80%-90% workload CPU bound, because
> clearly there's a ton of idle time.

Yes, you're right. There is considerable idle time, and calling the workload
CPU-bound is a bit of a stretch.

Yet I don't think I'm completely off the mark. The busy time is the same with
the machine running at 1.5 GHz and at 2.2 GHz (it just takes longer to
finish). To me it seems like the CPU is the bottleneck, with some overhead on
top.
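
To make that argument concrete, here is a minimal sketch of how a
frequency-invariant utilization signal behaves when the busy fraction stays
constant. It is only an illustration: the busy fraction (0.85, the middle of
the 80-90% mpstat range) and the f_max value (3.0 GHz) are assumed, not
measured, and the function name is hypothetical rather than a kernel symbol.
The real PELT signal is a decaying average; this only models the steady state.

```python
SCHED_CAPACITY_SCALE = 1024  # utilization value that represents full capacity


def invariant_util(busy_fraction, f_cur_ghz, f_max_ghz):
    """Steady-state utilization scaled by f_cur / f_max (frequency invariance).

    busy_fraction: fraction of wall-clock time the CPU is busy (0.0 to 1.0).
    f_cur_ghz:     frequency the CPU is actually running at.
    f_max_ghz:     frequency taken as the 100% reference (assumed value here).
    """
    return SCHED_CAPACITY_SCALE * busy_fraction * f_cur_ghz / f_max_ghz


BUSY = 0.85   # assumed: middle of the 80-90% range reported by mpstat
F_MAX = 3.0   # hypothetical reference frequency, not taken from the thread

# Same busy fraction at both clock speeds, as observed in the experiment.
for f_cur in (1.5, 2.2):
    util = invariant_util(BUSY, f_cur, F_MAX)
    print(f"{f_cur} GHz: util ~ {util:.0f}")
```

With the busy fraction pinned, the computed utilization simply follows the
clock frequency, which is the "not invariant anymore" behaviour described in
the problem statement.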

I will confirm what causes the idle time.


Giovanni
