linux-kernel - Re: [PATCH v3 03/12] sched/fair: Change the variable to hold the number of periods to 32bit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160510020527.GS16093@intel.com>
Date:	Tue, 10 May 2016 10:05:27 +0800
From:	Yuyang Du <yuyang.du@...el.com>
To:	Morten Rasmussen <morten.rasmussen@....com>
Cc:	peterz@...radead.org, mingo@...nel.org,
	linux-kernel@...r.kernel.org, bsegall@...gle.com, pjt@...gle.com,
	vincent.guittot@...aro.org, dietmar.eggemann@....com,
	juri.lelli@....com
Subject: Re: [PATCH v3 03/12] sched/fair: Change the variable to hold the
 number of periods to 32bit

On Tue, May 10, 2016 at 10:10:21AM +0100, Morten Rasmussen wrote:

> > > > 2. If m < 32*64, the chance to be here is very very low, but if so,
> > > 
> > > Should that be: n < 32*64 ?

Sorry, I overlooked this comment. Yes, it should be n < 32*64.

> > > 
> > > Talking about 32*64, I don't get why we don't use LOAD_AVG_MAX_N. I had
> > > a patch ready to post for that:
> > >
> > > >From 5055e5f82c8d207880035c2ec4ecf1ac1e7f9e91 Mon Sep 17 00:00:00 2001
> > > From: Morten Rasmussen <morten.rasmussen@....com>
> > > Date: Mon, 11 Apr 2016 15:41:37 +0100
> > > Subject: [PATCH] sched/fair: Fix decay to zero period in decay_load()
> > > 
> > > In __compute_runnable_contrib() we are happy with returning LOAD_AVG_MAX
> > > when the update period n >= LOAD_AVG_MAX_N (=345), so we should be happy
> > > with returning zero for n >= LOAD_AVG_MAX_N when decaying in
> > > decay_load() as well instead of only returning zero for n >
> > > LOAD_AVG_PERIOD * 63 (=2016).
> > 
> > So basically, you want to add another rule in addition to the exponential
> > decay rule.
> 
> No, not at all. I want to make the 'rules' symmetrical for accumulation
> and decay exactly like the patch does.

"Make the rule xxx" != change rule or add rule?
 
> > > >    the task's sched avgs MAY NOT be decayed to 0, depending on how
> > > >    big its sums are, and the chance to 0 is still good as load_sum
> > > >    is way less than ~0ULL and util_sum way less than ~0U.
> > > 
> > > I don't get the last bit about load_sum < ~0ULL and util_sum < ~0U.
> > > Whether you get to zero depends on the sums (as you say) and the actual
> > > value of 'n'. It is true that you might get to zero even if n <
> > > LOAD_AVG_MAX_N if the sums are small.
> >  
> > Frankly, util Ben brought it up, I didn't think a task sleeping so long
> > is even possible. But I do admit it may happen.
> > 
> > However, I will say this. A task sleeping so long is already very rare,
> > and among all those long sleep cases, the chance that after waking up the
> > avgs will not be decayed to zero is much less than 0.5 in a million
> > (=32*64/2^32=1/2^21), assuming the sleeping time is uniformly distributed.
> 
> You can't just ignore cases because they have a low probability. Going
> by that logic we could remove a lot of synchronization overhead in the
> kernel.
> 
> My concern is whether we introduce any assumptions that might hit us
> later when everybody has forgotten about them. This one would be
> extremely hard to debug later.

_NO_, that was just saying chance is very low. No any intent to say nor did I
say low chance doesn't matter. That was why I said the following paragraph.
 
> > > > Nevertheless, what really maters is what happens in the worst-case
> > > > scenario, which is when (u32)m = 0? So in that case, it would be like
> > > > after so long a sleep, we treat the task as it never slept, and has the
> > > > same sched averages as before it slept. That is actually not bad or
> > > > nothing to worry about, and think of it as the same as how we treat
> > > > a new born task.
> > > 
> > > There is subtle but important difference between not decaying a task
> > > correctly and adding new task: The sleeping task is already accounted
> > > for in the cfs_rq.avg.{load,util}_sum. The sleeping task's contribution
> > > to cfs_rq.avg has been decayed correctly in the mean time which means
> > > that by not guaranteeing a decay of the se.avg at wake-up you introduce
> > > a discrepancy between the task's owen view of its contribution (se.avg)
> > > and the cfs_rq view (cfs_rq.avg). That might lead to trouble if the task
> > > is migrated at wake-up as we remove se.avg amount of contribution from
> > > the previous cpu's cfs_rq.avg. In other words, you remove too much from
> > > the cfs_rq.avg.
> > > 
> > > The discrepancy will decay and disappear after LOAD_AVG_MAX_N ms, which
> > > might be acceptable, but it is not a totally harmless change IMHO.
> > 
> > That is just an explanation, :) nevermind, or I really don't think that is
> > a deal.
> 
> I don't really follow. I analysed the implications of the overflow that
> you are introducing. In my opinion this is what you should have done
> before proposing this patch. I think it is essential to understand what
> assumptions we make and introducing new ones should be carefully
> considered. I think it is a big 'deal' if you are not more careful when you
> are submitting patches and just ignore feedback. We spent a lot of time
> reviewing them.

So I agree "it is not a totally harmless change". But what is the deal/impact
of the harm? The harm in the worse case scenario will not hurt anything, IMHO,
and just an opinion.

Thank you very much for the rewiew. Really appreciate it.