Date:	Thu, 6 Nov 2014 11:24:59 -0600 (CST)
From:	Christoph Lameter <>
To:	Thomas Gleixner <>
cc:	Frederic Weisbecker <>,
	Gilad Ben-Yossef <>,
	Tejun Heo <>,
	John Stultz <>,
	Mike Frysinger <>,
	Minchan Kim <>,
	Hakan Akkan <>,
	Max Krasnyansky <>,
	"Paul E. McKenney" <>,
	Hugh Dickins <>,
	Viresh Kumar <>,
	"H. Peter Anvin" <>, Ingo Molnar <>,
	Peter Zijlstra <>
Subject: Re: [NOHZ] Remove scheduler_tick_max_deferment

On Sat, 1 Nov 2014, Thomas Gleixner wrote:

>  * balancing, etc... continue to move forward, even
>  * with a very low granularity.
> So this talks about the scheduler tick obviously, right?


> Now scheduler_tick() is invoked from update_process_times() and
> update_process_times() is invoked from tick_sched_handle() and that is
> invoked from either tick_sched_timer() or tick_nohz_handler().

> tick_sched_timer() is the hrtimer callback of tick_cpu_sched.sched_timer.
> That's used when high resolution timers are enabled.
> tick_nohz_handler() is the event handler for the clock event device if
> high resolution timers are disabled.
> Now the callsite of scheduler_tick_max_deferment() does:
>    time_delta = min(time_delta, scheduler_tick_max_deferment());
> And that is used further down after some other checks to arm either
> tick_cpu_sched.sched_timer or the clockevent itself.
> Which then when fired will invoke scheduler_tick() ....
> Really hard to figure out, right?
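
As a rough illustration of the clamping step described in the quoted text, the sketch below caps the requested sleep length so that a tick is always armed. It is not the kernel code: next_tick_delay_ns(), max_deferment_ns() and NO_TIMER_PENDING are hypothetical names, and the one-second cap is an assumption taken from the "at least one tick per second" intent of the quoted comment.

```c
#include <stdint.h>

#define NSEC_PER_SEC     1000000000ULL
#define NO_TIMER_PENDING UINT64_MAX   /* stands in for KTIME_MAX in this sketch */

/* Hypothetical stand-in for scheduler_tick_max_deferment(): a fixed 1 s cap. */
static uint64_t max_deferment_ns(void)
{
    return NSEC_PER_SEC;
}

/*
 * Mirrors "time_delta = min(time_delta, scheduler_tick_max_deferment())":
 * even when no timer is pending, the returned delay never exceeds the cap,
 * so the tick is still armed and scheduler_tick() keeps running.
 */
static uint64_t next_tick_delay_ns(uint64_t time_delta_ns)
{
    uint64_t cap = max_deferment_ns();

    return time_delta_ns < cap ? time_delta_ns : cap;
}
```

With the clamp in place, next_tick_delay_ns(NO_TIMER_PENDING) yields the one-second cap rather than "never"; removing the clamp is what would let the delay stay at the KTIME_MAX stand-in.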

I thought there is already logic in there to compensate for times when the
tick is off.

tick_do_update_jiffies64() computes the time delta since the last update,
derives the number of elapsed ticks from it, and calls do_timer() with that
count. The global load calculation is then also based on the number of ticks
that have passed, so it compensates when the tick is reenabled. And the load
during the dynticks busy period is known because one process is monopolizing
the processor during that time.
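
The catch-up idea can be sketched in user-space C as below. This is only an illustration of the arithmetic, not the kernel implementation: struct tick_state, catch_up_ticks() and the 1 ms tick period (HZ=1000) are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed tick period for this sketch: 1 ms, i.e. HZ=1000. */
#define TICK_PERIOD_NS 1000000ULL

/* Hypothetical bookkeeping, loosely modelled on a jiffies counter. */
struct tick_state {
    uint64_t last_update_ns;  /* time of the last accounted tick boundary */
    uint64_t jiffies;         /* ticks accounted so far */
};

/*
 * Account for every whole tick period that elapsed since the last update
 * and return how many were accounted. A caller would hand this count to
 * its do_timer()-like routine in one go.
 */
static uint64_t catch_up_ticks(struct tick_state *ts, uint64_t now_ns)
{
    uint64_t delta = now_ns - ts->last_update_ns;
    uint64_t ticks = delta / TICK_PERIOD_NS;

    if (ticks == 0)
        return 0;  /* less than one period elapsed, nothing to account */

    /* Advance by whole periods only, so the remainder is not lost. */
    ts->last_update_ns += ticks * TICK_PERIOD_NS;
    ts->jiffies += ticks;
    return ticks;
}
```

For instance, if the tick was off for 2.5 seconds, a single call accounts 2500 ticks at once instead of requiring 2500 separate interrupts.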

> It won't happen if time_delta is KTIME_MAX and the following checks do
> not find a timer armed.
>                  if (unlikely(expires.tv64 == KTIME_MAX)) {
>                         if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
>                                 hrtimer_cancel(&ts->sched_timer);
>                         goto out;
>                 }
> Which does either not arm the clockevent device (non highres) or
> cancels ts->sched_timer (highres).
> So in that case your timer interrupt will stop completely and therefore
> the scheduler updates on that cpu won't happen anymore.

Why is that bad? The load is constant and the timer interrupt can be
reenabled by the dynticks logic when a system call occurs that requires OS
services. I thought that was already done that way by Frederic?

> > Why does the scheduler require that tick? It seems that the processor is
> > always busy running exactly 1 process when the tick is not
> > occurring. Anything else will switch on the tick again. So the information
> > that the scheduler has never becomes outdated.
> Surely vruntime, load balancing data, load accounting and all the
> other stuff which contributes to global and local state updates itself
> magically.

There is logic in there that compensates when the tick is finally
reenabled. Load balancing data is already not updated when the tick is
disabled when the processor is idle, right? What is so different here?

> As I said before: It can be delegated to a housekeeper, but this needs
> to be implemented first before we can remove that function.

We did not need a housekeeper in the dynticks idle case. What is so
different about dynticks busy?

> There is a world outside of vmstat kworker, really.

Absolutely but I thought the logic is already there to compensate for
issues like the timer interrupt not occurring.

I may not have the complete picture of the timer tick processing in my
mind these days (it has been a lot of years since I did any work there,
after all), but as far as my arguably simplistic reading of the code goes, I
do not see why a housekeeper would be needed there. The load is constant
and known in the dynticks busy case, as it is in the dynticks idle case.
