Message-ID: <20150602002540.GB522@intel.com>
Date:	Tue, 2 Jun 2015 08:25:40 +0800
From:	Yuyang Du <yuyang.du@...el.com>
To:	mingo@...nel.org, peterz@...radead.org,
	linux-kernel@...r.kernel.org
Cc:	pjt@...gle.com, bsegall@...gle.com, morten.rasmussen@....com,
	vincent.guittot@...aro.org, dietmar.eggemann@....com,
	arjan.van.de.ven@...el.com, len.brown@...el.com,
	rafael.j.wysocki@...el.com, fengguang.wu@...el.com
Subject: Re: [PATCH v8 0/4] sched: Rewrite runnable load and utilization
 average tracking

Ping once more...

On Mon, May 25, 2015 at 09:49:43AM +0800, Yuyang Du wrote:
> Hi Peter and Ingo,
> 
> Changes are made for the 8th version:
> 
> 1) Rebase to the latest tip tree
> 2) scale_load_down the weight when doing the averages (see the sketch after
> this list)
> 3) change util_sum to u32
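> 
> A minimal sketch of point 2, assuming the __update_load_avg() entry point
> from this patchset and the kernel's scale_load_down() helper (the exact v8
> signature may differ):
> 
>     /* the entity's weight is scaled down before entering the averaging
>      * math, so load_sum/load_avg keep their documented bounds even when
>      * SCHED_LOAD_RESOLUTION > 0 */
>     __update_load_avg(now, cpu, &se->avg,
>                       se->on_rq * scale_load_down(se->load.weight),
>                       cfs_rq->curr == se);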
> 
> Thanks a lot for Ben's comments, which led to this version.
> 
> Regards,
> Yuyang
> 
> v7 changes:
> 
> The 7th version mostly accommodates the utilization load average recently
> merged into the kernel. As with the load average, the general idea is to
> update the cfs_rq as a whole, as opposed to updating one entity at a time
> and then updating the cfs_rq with only that one entity.
> 
> 1) Rename utilization_load_avg to util_avg to be concise and meaningful
> 
> 2) To track the cfs_rq util_avg, simply use "cfs_rq->curr != NULL" as the
> predicate. This should be equivalent to, but simpler than, aggregating each
> individual child sched_entity's util_avg when "cfs_rq->curr == se", because
> if cfs_rq->curr != NULL, cfs_rq->curr has to be some se (see the sketch
> after this list).
> 
> 3) Remove the se's util_avg from its cfs_rq's when migrating it; this was
> already proposed by Morten, and patches were sent.
> 
> 4) The group entity's load average is initialized when the entity is created
> 
> 5) Small nits: the entity's util_avg is removed from switched_from_fair()
> and task_move_group_fair().
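> 
> A minimal sketch of point 2, following the shape of the cfs_rq-wide update
> in this rewrite (names taken from the patchset; a hedged illustration, not
> a verbatim quote):
> 
>     /* "some entity is running on this cfs_rq" is simply
>      * cfs_rq->curr != NULL; no need to walk the children looking
>      * for the se with cfs_rq->curr == se */
>     __update_load_avg(now, cpu_of(rq_of(cfs_rq)), &cfs_rq->avg,
>                       scale_load_down(cfs_rq->load.weight),
>                       cfs_rq->curr != NULL);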
> 
> Thanks a lot for Vincent and Morten's help for the 7th version.
> 
> Thanks,
> Yuyang
> 
> v6 changes:
> 
> Many thanks to PeterZ for his review, to Dietmar, and to Fengguang for 0Day and LKP.
> 
> Rebased on v3.18-rc2.
> 
> - Unify the 32bit and 64bit decay_load by mul_u64_u32_shr (see the sketch
>   after this list)
> - Add force option in update_tg_load_avg
> - Read real-time cfs's load_avg for calc_tg_weight
> - Have tg_load_avg_contrib ifdef CONFIG_FAIR_GROUP_SCHED
> - Bug fix
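> 
> A sketch of the unified decay, assuming LOAD_AVG_PERIOD (= 32) and the
> precomputed inverse table runnable_avg_yN_inv[] (y^32 = 1/2) from
> kernel/sched/fair.c, plus mul_u64_u32_shr() from linux/math64.h; close to
> the posted code, but hedged:
> 
>     static __always_inline u64 decay_load(u64 val, u64 n)
>     {
>         unsigned int local_n;
> 
>         if (!n)
>             return val;
>         else if (unlikely(n > LOAD_AVG_PERIOD * 63))
>             return 0;
> 
>         /* after the bounds check, n fits an unsigned int */
>         local_n = n;
> 
>         /* y^n = 1/2^(n/LOAD_AVG_PERIOD) * y^(n%LOAD_AVG_PERIOD) */
>         if (unlikely(local_n >= LOAD_AVG_PERIOD)) {
>             val >>= local_n / LOAD_AVG_PERIOD;
>             local_n %= LOAD_AVG_PERIOD;
>         }
> 
>         /* one 64x32 multiply-shift serves both 32bit and 64bit */
>         return mul_u64_u32_shr(val, runnable_avg_yN_inv[local_n], 32);
>     }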
> 
> v5 changes:
> 
> Many thanks to Peter for reviewing this patchset in detail and for all his
> comments, to Mike for the general and cgroup pipe-tests, and to Morten, Ben,
> and Vincent for the discussion.
> 
> - Remove dead task and task group load_avg
> - Do not fold trivial deltas into the task_group load_avg (threshold: 1/64
>   of the old contribution; see the sketch after this list)
> - mul_u64_u32_shr() is used in decay_load, so on 64bit, load_sum can afford
>   about 4353082796 (=2^64/47742/88761) entities with the highest weight
>   (=88761) always runnable, greater than the previous theoretical maximum of
>   132845
> - Various code efficiency and style changes
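> 
> A sketch of the 1/64 threshold, shaped like the posted update_tg_load_avg()
> (field names per the patchset; a hedged reconstruction):
> 
>     static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
>     {
>         long delta = cfs_rq->avg.load_avg - cfs_rq->tg_load_avg_contrib;
> 
>         /* skip deltas below 1/64 of the last published contribution */
>         if (force || abs(delta) > cfs_rq->tg_load_avg_contrib / 64) {
>             atomic_long_add(delta, &cfs_rq->tg->load_avg);
>             cfs_rq->tg_load_avg_contrib = cfs_rq->avg.load_avg;
>         }
>     }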
> 
> We carried out some performance tests (thanks to Fengguang and his LKP). The
> results are shown as follows. The patchset (including three patches) is on
> top of mainline v3.16-rc5. We may report more perf numbers later.
> 
> Overall, this rewrite has better performance: reduced net overhead in load
> average tracking, and flat efficiency in the multi-layer cgroup pipe-test.
> 
> v4 changes:
> 
> Thanks to Morten, Ben, and Fengguang for v4 revision.
> 
> - Insert a memory barrier before writing cfs_rq->load_last_update_copy (see
>   the sketch after this list).
> - Fix typos.
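> 
> A sketch of the writer side, using the copy-field name from this cover
> letter (the final code may spell it differently):
> 
>     #ifndef CONFIG_64BIT
>         cfs_rq->avg.last_update_time = now;
>         /* order the 64bit value before its published copy, pairing
>          * with the reader's smp_rmb() */
>         smp_wmb();
>         cfs_rq->load_last_update_copy = cfs_rq->avg.last_update_time;
>     #endif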
> 
> v3 changes:
> 
> Many thanks to Ben for v3 revision.
> 
> Regarding the overflow issue, we now have for both entity and cfs_rq:
> 
> struct sched_avg {
>     .....
>     u64 load_sum;
>     unsigned long load_avg;
>     .....
> };
> 
> Given the weight for both entity and cfs_rq is:
> 
> struct load_weight {
>     unsigned long weight;
>     .....
> };
> 
> So, load_sum's max is 47742 * load.weight (which is unsigned long); on
> 32bit, it is absolutely safe. On 64bit, with unsigned long being 64bit, we
> can afford about 4353082796 (=2^64/47742/88761) entities with the highest
> weight (=88761) always runnable; even considering we may multiply by 1<<15
> in decay_load64, we can still support 132845 (=4353082796/2^15) always
> runnable entities, which should be acceptable.
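> 
> This headroom claim can be checked with a few lines of standalone userspace
> C (a hypothetical test program, not kernel code), using the constants above
> (47742 is the maximum decayed sum, 88761 is the nice -20 weight):
> 
>     #include <stdio.h>
>     #include <stdint.h>
> 
>     int main(void)
>     {
>         const uint64_t load_avg_max = 47742;  /* max \Sum 1024*y^n */
>         const uint64_t max_weight = 88761;    /* nice -20 weight */
> 
>         /* 2^64 / (47742 * 88761) ~= 4353082796 entities */
>         uint64_t entities = UINT64_MAX / (load_avg_max * max_weight);
>         printf("always-runnable max-weight entities: %llu\n",
>                (unsigned long long)entities);
> 
>         /* with the extra 1<<15 factor in decay_load64 */
>         printf("with the 1<<15 factor: %llu\n",
>                (unsigned long long)(entities >> 15));
>         return 0;
>     }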
> 
> load_avg = load_sum / 47742 <= load.weight (which is unsigned long), so it
> should be perfectly safe for both entity (even with arbitrary user group
> shares) and cfs_rq on both 32bit and 64bit. Originally, we saved this
> division, but we have to bring it back because of the overflow issue on
> 32bit (actually, the load average itself is safe from overflow, but the rest
> of the code referencing it always uses long, such as cpu_load, etc., which
> prevents us from saving the division).
> 
> - Fix overflow issue both for entity and cfs_rq on both 32bit and 64bit.
> - Track all entities (both task and group entity) due to group entity's clock issue.
>   This actually improves code simplicity.
> - Make a copy of cfs_rq sched_avg's last_update_time, to read an intact
>   64bit variable on a 32bit machine under a data race (see the sketch after
>   this list; hope I did it right).
> - Minor fixes and code improvement.
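> 
> A sketch of the matching 32bit reader, pairing with the v4 write barrier
> (the same hedged field name as above):
> 
>     #ifndef CONFIG_64BIT
>     static u64 cfs_rq_last_update_time(struct cfs_rq *cfs_rq)
>     {
>         u64 copy, last;
> 
>         /* retry until the copy matches: no torn 64bit read */
>         do {
>             copy = cfs_rq->load_last_update_copy;
>             smp_rmb();
>             last = cfs_rq->avg.last_update_time;
>         } while (last != copy);
> 
>         return last;
>     }
>     #endif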
> 
> v2 changes:
> 
> Thanks to PeterZ and Ben for their help in fixing the issues and improving
> the quality, and to Fengguang and his 0Day for finding compile errors in
> different configurations for version 2.
> 
> - Batch update the tg->load_avg, making sure it is up-to-date before update_cfs_shares
> - Remove a migrating task's load from its old CPU/cfs_rq, and do so with
>   atomic operations (see the sketch below)
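> 
> A sketch of the migration path, shaped like the posted
> remove_entity_load_avg(): a migrating task cannot take its old rq's lock, so
> its contribution is queued for removal atomically and folded in on the next
> cfs_rq update (simplified; the real code first syncs se->avg to the cfs_rq's
> last update time):
> 
>     static void remove_entity_load_avg(struct sched_entity *se)
>     {
>         struct cfs_rq *cfs_rq = cfs_rq_of(se);
> 
>         /* the owner CPU subtracts these on its next cfs_rq update */
>         atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
>         atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
>     }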
> 
> Yuyang Du (4):
>   sched: Remove rq's runnable avg
>   sched: Rewrite runnable load and utilization average tracking
>   sched: Init cfs_rq's sched_entity load average
>   sched: Remove task and group entity load when they are dead
> 
>  include/linux/sched.h |  40 ++-
>  kernel/sched/core.c   |   5 +-
>  kernel/sched/debug.c  |  42 +---
>  kernel/sched/fair.c   | 668 +++++++++++++++++---------------------------------
>  kernel/sched/sched.h  |  32 +--
>  5 files changed, 261 insertions(+), 526 deletions(-)
> 
> -- 
> 2.1.3
