Message-ID: <20150527133033.GI26396@e105550-lin.cambridge.arm.com>
Date:	Wed, 27 May 2015 14:30:33 +0100
From:	Morten Rasmussen <morten.rasmussen@....com>
To:	Chao Xie <xiechao_mail@....com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"peterz@...radead.org" <peterz@...radead.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
	Dietmar Eggemann <Dietmar.Eggemann@....com>,
	"yuyang.du@...el.com" <yuyang.du@...el.com>,
	"mturquette@...aro.org" <mturquette@...aro.org>,
	"preeti@...ux.vnet.ibm.com" <preeti@...ux.vnet.ibm.com>,
	"rjw@...ysocki.net" <rjw@...ysocki.net>,
	Juri Lelli <Juri.Lelli@....com>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>
Subject: Re: Re: Question about "Make sched entity usage tracking
 scale-invariant"

On Wed, May 27, 2015 at 02:49:40AM +0100, Chao Xie wrote:
> 
> At 2015-05-26 19:05:36, "Morten Rasmussen" <morten.rasmussen@....com> wrote:
> >Hi,
> >
> >[Adding maintainers and others to cc]
> >
> >On Mon, May 25, 2015 at 02:19:43AM +0100, Chao Xie wrote:
> >> Hi,
> >> I saw the patch “sched: Make sched entity usage tracking
> >> scale-invariant”, which makes the usage frequency-scaled.
> >> If the delta period that the usage calculation is based on spans a
> >> frequency change, how can you make sure the usage calculation is
> >> correct?
> >> The delta period may last hundreds of microseconds, while a frequency
> >> change window may be 10-20 microseconds, so many frequency changes can
> >> happen during the delta period.
> >> It seems the patch does not take this into account and just picks up
> >> the current frequency.
> >> How can you resolve this issue?
> >
> >Right. We don't know how many times the frequency may have changed since
> >last time we updated the entity usage tracking for the particular
> >entity. All we do is to call arch_scale_freq_capacity() and use that
> >scaling factor to compensate for whatever changes might have taken
> >place.
> >
> >The easiest implementation of arch_scale_freq_capacity() for most
> >architectures is to just return a scaling factor computed based on the
> >current frequency and ignoring when exactly the change happened and
> >ignoring if multiple changes happened. Depending on how often the
> >frequency might change this might be an acceptable approximation. While
> >the task is running the sched tick will update the entity usage tracking
> >(every 10ms by default on most ARM systems), hence we shouldn't be more
> >than a tick off in terms of when the frequency change is accounted for.
> >Under normal circumstances the delta period should therefore be <10ms
> >and generally shorter than that if you have more than one task runnable
> >on the cpu or the task(s) are not always-running. It is not perfect but
> >it is a lot better than the utilization tracking currently used by
> >cpufreq governors and better than the scheduler being completely unaware
> >of frequency scaling.
> >
> >For systems with very frequent frequency changes, i.e. fast hardware and
> >an aggressive governor leading to multiple changes in less than 10ms,
> >the solution above might not be sufficient. In that case, I think a
> >better solution might be to track the average frequency using hardware
> >counters or whatever tracking metrics the system might have to let
> >arch_scale_freq_capacity() return the average performance delivered over
> >the most recent period of time. AFAIK, x86 already has performance
> >counters (APERF/MPERF) that could be used for this purpose. The delta
> >period for each entity tracking update isn't fixed, but it might be
> >sufficient to just average over some fixed period of time. Accurate
> >tracking would require some time-stamp information to be stored in each
> >sched_entity for the true average to be computed for the delta period.
> >That quickly becomes rather messy but not impossible. I did look at it
> >briefly a while back, but decided not to go down that route until we
> >know that using current frequency or some fixed period average isn't
> >going to be sufficient. Usage or utilization is an average of something
> >that might be constantly changing, so it is never going to be very
> >accurate anyway. If it does turn out that we can't get the overall
> >picture right, we will need to improve it.
> >
> >Updating the entity tracking for each frequency change adds too much
> >overhead I think and seems unnecessary if we can make do with an
> >average scaling factor.
> >
> >I hope that answers your question. Have you observed any problems with
> >the usage tracking?
> >
> 
> Thanks for the explanation.
>
> I agree that the "delta" is less than 10ms in most situations, but I
> think at least one period needs to be considered. Suppose the
> frequency change happens only shortly, for example 10us, before the
> task starts to calculate its utilization, which may have a delta of
> 10ms. Then almost the whole delta will be calculated based on the new
> frequency, not the old one. The frequency change can be from the
> lowest to the highest, so in this case the delta calculation has a big
> deviation, and this situation is not rare.

Letting arch_scale_freq_capacity() return some average frequency over
the last tick period should at least smooth things out a bit.
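
A minimal sketch of the difference, not the kernel implementation: it
compares scaling a 10ms delta by the current frequency with scaling it
by a time-weighted average over the same delta. The helper names, the
200 MHz / 1 GHz frequencies and the segment list are assumptions made
for illustration; only the 10us-before-a-10ms-delta scenario and the
idea of a factor in 1/1024 units come from the discussion.

```python
SCHED_CAPACITY_SCALE = 1024  # fixed-point unit used for capacity values


def scale_factor(freq_khz, max_freq_khz):
    """Current-frequency factor: cur/max expressed in 1/1024 units."""
    return freq_khz * SCHED_CAPACITY_SCALE // max_freq_khz


def average_scale_factor(segments, max_freq_khz):
    """Time-weighted average factor over (duration_us, freq_khz) segments."""
    total_us = sum(duration for duration, _ in segments)
    weighted = sum(duration * scale_factor(freq, max_freq_khz)
                   for duration, freq in segments)
    return weighted // total_us


def scaled_delta(delta_us, factor):
    """Running time scaled by a frequency factor."""
    return delta_us * factor // SCHED_CAPACITY_SCALE


# Hypothetical numbers: 10 ms delta, 200 MHz for 9990 us, then 1 GHz for 10 us.
MAX_FREQ_KHZ = 1_000_000
segments = [(9_990, 200_000), (10, 1_000_000)]
delta_us = sum(duration for duration, _ in segments)

# Factor sampled at update time (the frequency of the last segment).
current = scaled_delta(delta_us, scale_factor(segments[-1][1], MAX_FREQ_KHZ))
# Factor averaged over the whole delta.
averaged = scaled_delta(delta_us, average_scale_factor(segments, MAX_FREQ_KHZ))
print(current, averaged)  # 10000 1992
```

With these assumed numbers the current-frequency factor credits the
full 10000us, while the averaged factor credits 1992us, close to the
roughly 2008us of full-speed-equivalent running time in this example.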

Also worth noting is that this problem of frequency changes being out of
phase relative to the scheduler ticks might be significantly reduced
(maybe go away entirely?) if we make frequency changes event-driven from
the scheduler. Since frequency changes would only be initiated from the
scheduler the load-tracking should be up to date whenever a frequency
change is requested and hence the scenario above shouldn't be possible.
Scheduler/dvfs integration is still being discussed though. You may want
to have a look at the discussion of Mike Turquette's patches if you are
interested.
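
A toy model of why event-driven changes would avoid the phase problem,
under stated assumptions: the Cpu class, its method names and the
frequencies are hypothetical and do not correspond to any scheduler or
cpufreq API. The only idea taken from the text is that the tracking is
brought up to date before the frequency changes, so each delta is
scaled by the frequency that was actually in effect.

```python
SCHED_CAPACITY_SCALE = 1024  # fixed-point unit used for capacity values


class Cpu:
    """Hypothetical CPU whose frequency is only changed by the scheduler."""

    def __init__(self, freq_khz, max_freq_khz):
        self.freq_khz = freq_khz
        self.max_freq_khz = max_freq_khz
        self.last_update_us = 0
        self.scaled_running_us = 0

    def update_tracking(self, now_us):
        """Account the time since the last update at the current frequency."""
        delta_us = now_us - self.last_update_us
        factor = self.freq_khz * SCHED_CAPACITY_SCALE // self.max_freq_khz
        self.scaled_running_us += delta_us * factor // SCHED_CAPACITY_SCALE
        self.last_update_us = now_us

    def request_freq(self, now_us, new_freq_khz):
        """Close the delta at the old frequency, then switch frequency."""
        self.update_tracking(now_us)
        self.freq_khz = new_freq_khz


# Same scenario: 200 MHz for 9990 us, then a jump to 1 GHz for the last 10 us.
cpu = Cpu(freq_khz=200_000, max_freq_khz=1_000_000)
cpu.request_freq(now_us=9_990, new_freq_khz=1_000_000)
cpu.update_tracking(now_us=10_000)
print(cpu.scaled_running_us)  # 2000
```

In this model no delta ever spans a frequency change, so the accounted
time stays close to the work delivered without storing per-entity
timestamps.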