lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <b378f48593ca7449257a1bb55e78b186d88cd9f1.camel@surriel.com>
Date: Mon, 14 Apr 2025 15:57:42 -0400
From: Rik van Riel <riel@...riel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Chris Mason <clm@...a.com>, Pat Cody <pat@...cody.io>, mingo@...hat.com,
 	juri.lelli@...hat.com, vincent.guittot@...aro.org,
 dietmar.eggemann@....com, 	rostedt@...dmis.org, bsegall@...gle.com,
 mgorman@...e.de, vschneid@...hat.com, 	linux-kernel@...r.kernel.org,
 patcody@...a.com, kernel-team@...a.com, Breno Leitao	 <leitao@...ian.org>
Subject: Re: [PATCH] sched/fair: Add null pointer check to pick_next_entity()

On Wed, 2025-04-02 at 10:22 +0200, Peter Zijlstra wrote:
> 
> Please confirm what the reason for overflow is.
> 
Running a large enough sample size has its benefits.

We have hit 3 out of the 4 warnings below.

The only one we did not hit is the cfs_rq->avg_load != avg_load
warning.

Most of the time we seem to hit the warnings from the
code that removes tasks from the runqueue, but we are
occasionally seeing it when adding tasks to the runqueue,
as well.


> +++ b/kernel/sched/fair.c
> @@ -620,12 +620,36 @@ static inline s64 entity_key(struct cfs_
>   *
>   * As measured, the max (key * weight) value was ~44 bits for a
> kernel build.
>   */
> +
> +static void avg_vruntime_validate(struct cfs_rq *cfs_rq)
> +{
> +	unsigned long load = 0;
> +	s64 vruntime = 0;
> +
> +	for (struct rb_node *node = rb_first_cached(&cfs_rq-
> >tasks_timeline);
> +	     node; node = rb_next(node)) {
> +		struct sched_entity *se = __node_2_se(node);
> +		unsigned long weight = scale_load_down(se-
> >load.weight);
> +		s64 key = entity_key(cfs_rq, se);
> +		/* vruntime += key * weight; */
> +		WARN_ON_ONCE(__builtin_mul_overflow(key, weight,
> &key));
> +		WARN_ON_ONCE(__builtin_add_overflow(vruntime, key,
> &vruntime));
> +		load += weight;
> +	}
> +
> +	WARN_ON_ONCE(cfs_rq->avg_load != load);
> +	WARN_ON_ONCE(cfs_rq->avg_vruntime != vruntime);
> +}
> +
>  static void
>  avg_vruntime_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
>  	unsigned long weight = scale_load_down(se->load.weight);
>  	s64 key = entity_key(cfs_rq, se);
>  
> +	/* not yet added to tree */
> +	avg_vruntime_validate(cfs_rq);
> +
>  	cfs_rq->avg_vruntime += key * weight;
>  	cfs_rq->avg_load += weight;
>  }
> @@ -638,6 +662,9 @@ avg_vruntime_sub(struct cfs_rq *cfs_rq,
>  
>  	cfs_rq->avg_vruntime -= key * weight;
>  	cfs_rq->avg_load -= weight;
> +
> +	/* already removed from tree */
> +	avg_vruntime_validate(cfs_rq);
>  }
>  
>  static inline
> 

-- 
All Rights Reversed.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ