Date:	Thu, 16 Jun 2016 14:25:04 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Yuyang Du <yuyang.du@...el.com>
Cc:	Chris Wilson <chris@...is-wilson.co.uk>,
	Andrey Ryabinin <aryabinin@...tuozzo.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Mike Galbraith <efault@....de>,
	Thomas Gleixner <tglx@...utronix.de>, bsegall@...gle.com,
	morten.rasmussen@....com, pjt@...gle.com, steve.muckle@...aro.org,
	linux-kernel@...r.kernel.org, kernel@...p.com
Subject: Re: Divide-by-zero in post_init_entity_util_avg

On Thu, Jun 16, 2016 at 10:50:40AM +0200, Peter Zijlstra wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index f75930bdd326..3fd3d903e6b6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2878,6 +2878,20 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
>  	}
>  }
>  
> +/*
> + * Explicitly do a load-store to ensure the temporary value never hits memory.
> + * This allows lockless observations without ever seeing the negative values.
> + *
> + * Incidentally, this also generates much saner code for x86.
> + */
> +#define sub_positive(type, ptr, val) do {			\
> +	type tmp = READ_ONCE(*ptr);				\
> +	tmp -= (val);						\
> +	if (tmp < 0)						\
> +		tmp = 0;					\
> +	WRITE_ONCE(*ptr, tmp);					\
> +} while (0)
> +
>  /* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
>  static inline int
>  update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
> @@ -2887,15 +2901,15 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
>  
>  	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
>  		s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
> -		sa->load_avg = max_t(long, sa->load_avg - r, 0);
> -		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
> +		sub_positive(long, &sa->load_avg, r);
> +		sub_positive(s64,  &sa->load_sum, r * LOAD_AVG_MAX);

Hmm, so either we should change these variables to signed types, as
forced here, or this logic (along with the former max_t() version) is
plain wrong.

As it stands, any unsigned value with the MSB set will wipe the field
after this subtraction.
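
To make that concrete, here is a minimal userspace sketch (my own, not
part of the patch), with plain loads and stores standing in for
READ_ONCE()/WRITE_ONCE(); converting an unsigned value above LONG_MAX
to a signed type is implementation-defined, but on the usual ABIs it
goes negative and the clamp wipes the field:

#include <stdio.h>

/* Userspace stand-in for the macro above; plain accesses instead of
 * READ_ONCE()/WRITE_ONCE(). */
#define sub_positive(type, ptr, val) do {			\
	type tmp = *(ptr);					\
	tmp -= (val);						\
	if (tmp < 0)						\
		tmp = 0;					\
	*(ptr) = tmp;						\
} while (0)

int main(void)
{
	unsigned long load_avg = ~0UL - 10;	/* huge value, MSB set */

	/* Subtract a tiny amount; the true result should stay huge. */
	sub_positive(long, &load_avg, 1);

	printf("%lu\n", load_avg);	/* prints 0 on typical LP64 targets */
	return 0;
}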

I suppose instead we'd want something like:

	tmp = READ_ONCE(*ptr);
	if (tmp > val)
		tmp -= val;
	else
		tmp = 0;
	WRITE_ONCE(*ptr, tmp);

In order to generate:

  xchg   %rax,0xa0(%r13)
  mov    0x78(%r13),%rcx
  sub    %rax,%rcx
  cmovae %r15,%rcx
  mov    %rcx,0x78(%r13)

however, GCC isn't smart enough and generates:

  xchg   %rax,0x98(%r13)
  mov    0x70(%r13),%rsi
  mov    %rsi,%rcx
  sub    %rax,%rcx
  cmp    %rsi,%rax
  cmovae %r15,%rcx
  mov    %rcx,0x70(%r13)

That is, it does a CMP on the _same_ values it already does the SUB
with, which produces exactly the same CC values, so the CMP is
redundant.

(This is with gcc-5.3; I'm still trying to build gcc-6.1 from the Debian
package, though I suppose I should just give up and do a source build.)
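
For anyone wanting to reproduce the comparison, a minimal standalone
function (my own reduction, with plain accesses standing in for
READ_ONCE()/WRITE_ONCE()) can be fed to the compiler with e.g.
"gcc -O2 -S" and the output inspected for the redundant CMP; whether it
shows up depends on the compiler version:

/* Unsigned-safe subtract-with-clamp, as sketched above. */
void sub_positive_clamp(unsigned long *ptr, unsigned long val)
{
	unsigned long tmp = *ptr;

	if (tmp > val)
		tmp -= val;
	else
		tmp = 0;

	*ptr = tmp;
}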

Opinions?
