Message-ID: <20151012174532.GB1113@lerouge>
Date:	Mon, 12 Oct 2015 19:45:35 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Byungchul Park <byungchul.park@....com>, mingo@...nel.org,
	linux-kernel@...r.kernel.org, tglx@...utronix.de
Subject: Re: [PATCH v3 2/2] sched: consider missed ticks when updating global
 cpu load

On Mon, Oct 05, 2015 at 10:15:55AM +0200, Peter Zijlstra wrote:
> On Sun, Oct 04, 2015 at 03:58:19PM +0900, Byungchul Park wrote:
> > anyway, it's wrong for update_process_times() to assume 1 tick because
> > tick_irq_exit() -> tick_nohz_irq_exit() -> tick_nohz_full_update_tick()
> > -> tick_nohz_restart_sched_tick() can happen at full NOHZ as i already
> > said. in this full NOHZ case for tick to restart from non-idle,
> 
> NO_HZ_FULL is very much a work in progress, there's plenty wrong with
> it. But yes, if it does this then its broken here too, I'm not sure if
> Frederic is aware of this or not (I'm sure he's got a fairly big list of
> broken for NO_HZ_FULL).

Indeed and cpu load active is part of what needs to be fixed. I hope this
patchset will help.

> 
> > 1. update_process_times() -> account_process_tick() must be able to handle
> > more than one tick, or tick_nohz_restart_sched_tick() should handle the
> > case additionally. (i think the latter is better.) i will try to modify
> > the code to handle it if you agree with me.
> 
> Yes, and we need to audit all the other stuff called from
> update_process_times().
> 
> run_local_timers() seems to be ok.
> rcu_check_callbacks() also doesn't seem to care about ticks.
> 
> I _think_ we fixed most of the scheduler_tick()
> stuff (under the assumption that TSC is stable), but I'm not sure.

Concerning a variable number of pending ticks, update_process_times() is
fine except for a few things in scheduler_tick():

* cpu load active
* sched_avg_update() handles missed ticks well, as it's based on the rq clock
  and a specific period for updates. But I'm worried about remote reads of
  rt_avg, if any.
* calc_global_load_tick(), not sure about this one
* trigger_load_balance()
* the infamous task_tick() :-)

But load avg looks to me like a pretty standalone issue. So is each of these
small issues.
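
To illustrate why the cpu load item above cares about the number of pending
ticks, here is a minimal user-space sketch, not kernel code. It assumes a
simplified decaying average in which each tick gives the new sample a weight
of 1/2^idx; load_step() and load_fold() are hypothetical names used only for
this example.

```c
#include <assert.h>

/* One tick of a decaying average: the new sample gets weight 1/2^idx. */
static unsigned long load_step(unsigned long old, unsigned long cur,
			       unsigned int idx)
{
	unsigned long scale = 1UL << idx;

	return (old * (scale - 1) + cur) >> idx;
}

/* Fold nr_ticks pending ticks into the average, one step per tick. */
static unsigned long load_fold(unsigned long old, unsigned long cur,
			       unsigned int idx, unsigned long nr_ticks)
{
	assert(idx < 8 * sizeof(unsigned long));

	while (nr_ticks--)
		old = load_step(old, cur, idx);

	return old;
}
```

In this model, accounting a single tick where three were pending leaves the
average further from the current load than three steps would, which is the
kind of error that assuming 1 tick introduces.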

> 
> and run_posix_cpu_timers() might also be ok.
> 
> > 2. to handle full NOHZ, tick_nohz_restart_sched_tick() should call
> > update_cpu_load_active() instead of update_cpu_load_nohz() with my 1/2
> > patch and 2/2 patch, or we should modify update_cpu_load_nohz() to know
> > full NOHZ, which currently don't know full NOHZ. (you may agree with the
> > latter.) in any case, 1/2 patch is necessary which current code is
> > absolutely missing.
> > 
> > peter, what do you think about my opinion? and about my 1/2 patch?
> 
> I did not look too closely, but it might have the right shape for
> dealing with !idle ticks. I'd have to look more closely at it.
> 
> > i will modify 2/2 patch depending on your feedback.
> 
> I think it will take more than a single patch to rework all of
> update_process_times(). And we should also ask Thomas for his opinion,
> but I think we want:
> 
> 	- make update_process_times() take a nr_ticks argument
> 	  - fixup everything below it
> 
> 	- fix tick_nohz_handler to not ignore the hrtimer_forward()
> 	  return value and pass it into
> 	  tick_sched_handle()/update_process_times().
> 
> 	  (assuming this is the right oneshot tick part, tick-common
> 	  seems to be about periodic timers which aren't used much ?!)

tick_nohz_handler() is the low res nohz handler. tick_sched_handle()
is the high res one (I should rename these). I think we should rather
find out the pending updates from update_process_times() itself and pass
them to scheduler_tick(), which is the one interested in them.
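
A rough user-space sketch of that idea follows; it is only a model under
stated assumptions, not a proposed patch. struct tick_state,
model_update_process_times() and model_scheduler_tick() are hypothetical
stand-ins, and the timestamps are plain nanosecond counters rather than
ktime_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; not an existing kernel structure. */
struct tick_state {
	uint64_t last_tick_ns;	/* time of the last accounted tick */
	uint64_t period_ns;	/* tick period, i.e. 1e9 / HZ */
	uint64_t sched_ticks;	/* ticks handed to the scheduler so far */
};

/* Stand-in for scheduler_tick() taking the number of pending ticks. */
static void model_scheduler_tick(struct tick_state *ts, uint64_t nr_ticks)
{
	ts->sched_ticks += nr_ticks;
}

/*
 * Stand-in for update_process_times(): it works out by itself how many
 * whole tick periods elapsed and passes that count down.
 */
static uint64_t model_update_process_times(struct tick_state *ts,
					   uint64_t now_ns)
{
	uint64_t pending;

	assert(ts->period_ns > 0);

	if (now_ns < ts->last_tick_ns)
		return 0;

	pending = (now_ns - ts->last_tick_ns) / ts->period_ns;
	if (!pending)
		return 0;

	/* Advance by whole periods only, keeping the remainder for later. */
	ts->last_tick_ns += pending * ts->period_ns;
	model_scheduler_tick(ts, pending);

	return pending;
}
```

With this shape the tick handlers do not need to carry the count around;
only the callee that cares about it sees nr_ticks.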

Thanks.