lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 15 Sep 2015 09:43:39 +0100
From:	Morten Rasmussen <morten.rasmussen@....com>
To:	bsegall@...gle.com
Cc:	Yuyang Du <yuyang.du@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Steve Muckle <steve.muckle@...aro.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
	"mturquette@...libre.com" <mturquette@...libre.com>,
	"rjw@...ysocki.net" <rjw@...ysocki.net>,
	Juri Lelli <Juri.Lelli@....com>,
	"sgurrappadi@...dia.com" <sgurrappadi@...dia.com>,
	"pang.xunlei@....com.cn" <pang.xunlei@....com.cn>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by
 capacity_orig

On Mon, Sep 14, 2015 at 10:34:00AM -0700, bsegall@...gle.com wrote:
> Morten Rasmussen <morten.rasmussen@....com> writes:
> 
> > On Fri, Sep 11, 2015 at 10:05:53AM -0700, bsegall@...gle.com wrote:
> >> Morten Rasmussen <morten.rasmussen@....com> writes:
> >> 
> >> > On Fri, Sep 11, 2015 at 08:28:25AM +0800, Yuyang Du wrote:
> >> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> >> index 119823d..55a7b93 100644
> >> >> --- a/include/linux/sched.h
> >> >> +++ b/include/linux/sched.h
> >> >> @@ -912,7 +912,7 @@ enum cpu_idle_type {
> >> >>  /*
> >> >>   * Increase resolution of cpu_capacity calculations
> >> >>   */
> >> >> -#define SCHED_CAPACITY_SHIFT	10
> >> >> +#define SCHED_CAPACITY_SHIFT	SCHED_RESOLUTION_SHIFT
> >> >>  #define SCHED_CAPACITY_SCALE	(1L << SCHED_CAPACITY_SHIFT)
> >> >>  
> >> >>  /*
> >> >> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> >> >> index 68cda11..d27cdd8 100644
> >> >> --- a/kernel/sched/sched.h
> >> >> +++ b/kernel/sched/sched.h
> >> >> @@ -40,6 +40,9 @@ static inline void update_cpu_load_active(struct rq *this_rq) { }
> >> >>   */
> >> >>  #define NS_TO_JIFFIES(TIME)	((unsigned long)(TIME) / (NSEC_PER_SEC / HZ))
> >> >>  
> >> >> +# define SCHED_RESOLUTION_SHIFT	10
> >> >> +# define SCHED_RESOLUTION_SCALE	(1L << SCHED_RESOLUTION_SHIFT)
> >> >> +
> >> >>  /*
> >> >>   * Increase resolution of nice-level calculations for 64-bit architectures.
> >> >>   * The extra resolution improves shares distribution and load balancing of
> >> >> @@ -53,16 +56,15 @@ static inline void update_cpu_load_active(struct rq *this_rq) { }
> >> >>   * increased costs.
> >> >>   */
> >> >>  #if 0 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load  */
> >> >> -# define SCHED_LOAD_RESOLUTION	10
> >> >> -# define scale_load(w)		((w) << SCHED_LOAD_RESOLUTION)
> >> >> -# define scale_load_down(w)	((w) >> SCHED_LOAD_RESOLUTION)
> >> >> +# define SCHED_LOAD_SHIFT	(SCHED_RESOLUTION_SHIFT + SCHED_RESOLUTION_SHIFT)
> >> >> +# define scale_load(w)		((w) << SCHED_RESOLUTION_SHIFT)
> >> >> +# define scale_load_down(w)	((w) >> SCHED_RESOLUTION_SHIFT)
> >> >>  #else
> >> >> -# define SCHED_LOAD_RESOLUTION	0
> >> >> +# define SCHED_LOAD_SHIFT	(SCHED_RESOLUTION_SHIFT)
> >> >>  # define scale_load(w)		(w)
> >> >>  # define scale_load_down(w)	(w)
> >> >>  #endif
> >> >>  
> >> >> -#define SCHED_LOAD_SHIFT	(10 + SCHED_LOAD_RESOLUTION)
> >> >>  #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)
> >> >>  
> >> >>  #define NICE_0_LOAD		SCHED_LOAD_SCALE
> >> >
> >> > I think this is pretty much the required relationship between all the
> >> > SHIFTs and SCALEs that Peter checked for in his #if-#error thing
> >> > earlier, so no disagreements from my side :-)
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> > the body of a message to majordomo@...r.kernel.org
> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> > Please read the FAQ at  http://www.tux.org/lkml/
> >> 
> >> SCHED_LOAD_RESOLUTION and the non-SLR part of SCHED_LOAD_SHIFT are not
> >> required to be the same value and should not be conflated.
> >> 
> >> In particular, since cgroups are on the same timeline as tasks and their
> >> shares are not scaled by SCHED_LOAD_SHIFT in any way (but are scaled so
> >> that SCHED_LOAD_RESOLUTION is invisible), changing that part of
> >> SCHED_LOAD_SHIFT would cause issues, since things can assume that nice-0
> >> = 1024. However changing SCHED_LOAD_RESOLUTION would be fine, as that is
> >> an internal value to the kernel.
> >> 
> >> In addition, changing the non-SLR part of SCHED_LOAD_SHIFT would require
> >> recomputing all of prio_to_weight/wmult for the new NICE_0_LOAD.
> >
> > I think I follow, but doesn't that mean that the current code is broken
> > too? NICE_0_LOAD changes if you change SCHED_LOAD_RESOLUTION:
> >
> > #define SCHED_LOAD_SHIFT        (10 + SCHED_LOAD_RESOLUTION)
> > #define SCHED_LOAD_SCALE        (1L << SCHED_LOAD_SHIFT)
> >
> > #define NICE_0_LOAD             SCHED_LOAD_SCALE
> > #define NICE_0_SHIFT            SCHED_LOAD_SHIFT
> >
> > To me it sounds like we need to define it the other way around:
> >
> > #define NICE_0_SHIFT            10
> > #define NICE_0_LOAD             (1L << NICE_0_SHIFT)
> >
> > and then add any additional resolution bits from there to ensure that
> > NICE_0_LOAD and the prio_to_weight/wmult tables are unchanged.
> 
> No, NICE_0_LOAD is supposed to be scale_load(prio_to_weight[nice_0]),
> ie including SLR. It has never been clear to me what
> SCHED_LOAD_SCALE/SCHED_LOAD_SHIFT were for as opposed to NICE_0_LOAD,

I see, I wasn't sure if NICE_0_LOAD is being used in the code somewhere
with the assumption that NICE_0_LOAD = load.weight = 1024. The
scale_(down_)_load() conversion between base load (nice_0 = 1024) and
hi-res load makes makes sense.

> and the new utilization uses of it are entirely unlinked to 1024 == NICE_0

Yes, agreed. For utilization we just need to define some fixed point
resolution (as Yuyang said). That resolution is independent of the hi-res
load additional bits and should remain so. The same fixed point
resolution has to be used for capacity as well unless we want to
introduce scale_(down_)_capacity() functions to allow utilization to be
compared to capacity.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ