linux-kernel - Re: [RFC v3 1/3] kernel/time/clockevents: initial support for mono to raw time conversion


Date:	Thu, 21 Jul 2016 11:08:27 -0700
From:	John Stultz <john.stultz@...aro.org>
To:	Nicolai Stange <nicstange@...il.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [RFC v3 1/3] kernel/time/clockevents: initial support for mono to
 raw time conversion

On Wed, Jul 13, 2016 at 6:00 AM, Nicolai Stange <nicstange@...il.com> wrote:
> With NOHZ_FULL and one single well-isolated, CPU consumptive task, one
> would expect approximately one clockevent interrupt per second. However, on
> my Intel Haswell where the monotonic clock is the TSC monotonic clock and
> the clockevent device is the TSC deadline device, it turns out that every
> second, there are two such interrupts: the first one always arrives
> approximately 50us before the scheduled deadline as programmed by
> tick_nohz_stop_sched_tick() through the hrtimer API. The
> __hrtimer_run_queues() called in this interrupt detects that the queued
> tick_sched_timer hasn't expired yet and simply does nothing except
> reprogramming the clock event device to fire shortly after again.
>
> These too early programmed deadlines are explained as follows:
> clockevents_program_event() programs the clockevent device to fire
> after
>   f_event * delta_t_progr
> clockevent device cycles where f_event is the clockevent device's hardware
> frequency and delta_t_progr is the requested time interval. After that many
> clockevent device cycles have elapsed, the device underlying the monotonic
> clock, that is, the monotonic raw clock, has seen f_raw / f_event times as
> many cycles.
> The ktime_get() called from __hrtimer_run_queues() interprets those
> cycles to run at the frequency of the monotonic clock. Summarizing:
>   delta_t_perc = 1/f_mono * f_raw/f_event * f_event * delta_t_progr
>                = f_raw / f_mono * delta_t_progr
> with f_mono being the monotonic clock's frequency and delta_t_perc being
> the elapsed time interval as perceived by __hrtimer_run_queues().
>
> Now, f_mono is not a constant, but is dynamically adjusted in
> timekeeping_adjust() in order to compensate for the NTP error. With the
> large values of delta_t_progr of 10^9ns with NOHZ_FULL, the error made
> becomes significant and results in the double timer interrupts described
> above.
>
> Compensate for this error by multiplying the clockevent device's f_event
> by f_mono/f_raw.
>
> Namely:
> - Introduce a ->mult_mono member to the struct clock_event_device. Its
>   value is supposed to be equal to ->mult * f_mono/f_raw.
> - Introduce the timekeeping_get_mono_mult() helper which provides
>   the clockevent core with access to the timekeeping's current f_mono
>   and f_raw.
> - Introduce the helper __clockevents_adjust_freq() which
>   sets a clockevent device's ->mult_mono member as appropriate. It is
>   implemented with the help of the new __clockevents_calc_adjust_freq().
> - Call __clockevents_adjust_freq() at clockevent device registration time
>   as well as at frequency updates through clockevents_update_freq().
> - Finally, use the ->mult_mono rather than ->mult in the ns to cycle
>   conversion made in clockevents_program_event().
>
> Note that future adjustments of the monotonic clock are not taken into
> account yet. Furthermore, this patch assumes that after a clockevent
> device's registration, its ->mult changes only through calls to
> clockevents_update_freq().
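
To make the arithmetic in the quoted description concrete, here is a small userspace sketch of the relation delta_t_perc = f_raw / f_mono * delta_t_progr and of the proposed f_event * f_mono / f_raw compensation. It is not kernel code: the function names are hypothetical, it works on frequencies in Hz with doubles rather than on the kernel's mult/shift pairs, and the 50 ppm offset in the usage note is an assumed value picked only to reproduce the roughly 50us early wakeup.

```c
#include <assert.h>

/*
 * Interval (ns) that ktime_get() would perceive after the clockevent device
 * was programmed for delta_progr_ns without any compensation:
 *   delta_t_perc = f_raw / f_mono * delta_t_progr
 */
static inline double perceived_ns(double f_raw_hz, double f_mono_hz,
				  double delta_progr_ns)
{
	assert(f_mono_hz > 0.0);
	return f_raw_hz / f_mono_hz * delta_progr_ns;
}

/* Compensated clockevent frequency: f_event * f_mono / f_raw. */
static inline double adjusted_event_hz(double f_event_hz, double f_raw_hz,
				       double f_mono_hz)
{
	assert(f_raw_hz > 0.0);
	return f_event_hz * f_mono_hz / f_raw_hz;
}

/*
 * Perceived interval (ns) when the device is programmed with the
 * compensated frequency instead of f_event.
 */
static inline double perceived_adjusted_ns(double f_event_hz, double f_raw_hz,
					   double f_mono_hz,
					   double delta_progr_ns)
{
	/* Clockevent cycles programmed with the adjusted frequency. */
	double event_cycles = adjusted_event_hz(f_event_hz, f_raw_hz, f_mono_hz)
			      * delta_progr_ns / 1e9;
	/* Raw clocksource cycles seen while those event cycles elapse. */
	double raw_cycles = event_cycles * f_raw_hz / f_event_hz;

	/* ktime_get() interprets the raw cycles at f_mono. */
	return raw_cycles / f_mono_hz * 1e9;
}
```

With f_raw = 1e9 Hz, f_mono = 1.00005e9 Hz (an assumed 50 ppm adjustment) and delta_t_progr = 1e9 ns, perceived_ns() comes out about 50us short of the programmed second, while perceived_adjusted_ns() gives back the programmed interval up to rounding.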

Sorry for being a little slow to review here. Been swamped.

I was about to queue this but had a few nits that need addressing.


> Signed-off-by: Nicolai Stange <nicstange@...il.com>
> ---
>  include/linux/clockchips.h  |  1 +
>  kernel/time/clockevents.c   | 49 ++++++++++++++++++++++++++++++++++++++++++++-
>  kernel/time/tick-internal.h |  1 +
>  kernel/time/timekeeping.c   |  8 ++++++++
>  4 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
> index 0d442e3..2863742 100644
> --- a/include/linux/clockchips.h
> +++ b/include/linux/clockchips.h
> @@ -104,6 +104,7 @@ struct clock_event_device {
>         u64                     max_delta_ns;
>         u64                     min_delta_ns;
>         u32                     mult;
> +       u32                     mult_mono;

So in this context (for me at least), mult and mult_mono are a bit
confusing.  I tend to think of it as mult and mult_raw, but in this
case mult is the "raw" unmodified value and mult_mono is the adjusted
one.

I'd probably suggest mult_adjusted or some other name to make it more
clear how it differs from the clockevent mult.

>
> +void timekeeping_get_mono_mult(u32 *mult_cs_mono, u32 *mult_cs_raw)
> +{
> +       struct tk_read_base *tkr_mono = &tk_core.timekeeper.tkr_mono;
> +
> +       *mult_cs_mono = tkr_mono->mult;
> +       *mult_cs_raw = tkr_mono->clock->mult;
> +}

So... you probably should have some locking here. Or at least a big
comment making it clear why locking isn't necessary.
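
As a rough illustration of the concern (reading two related values that a writer may be updating at the same time), here is a userspace sketch of a sequence-counter retry loop written with C11 atomics. It is only an analogy and not the kernel API: struct mult_pair and its helpers are hypothetical names, it assumes a single writer, and it uses the default sequentially consistent atomics for simplicity rather than tuned memory ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two multipliers the helper reads. */
struct mult_pair {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic uint32_t mult_mono;
	_Atomic uint32_t mult_raw;
};

static inline void mult_pair_init(struct mult_pair *p)
{
	atomic_init(&p->seq, 0);
	atomic_init(&p->mult_mono, 0);
	atomic_init(&p->mult_raw, 0);
}

/* Writer side; assumes exactly one writer at a time. */
static inline void mult_pair_write(struct mult_pair *p, uint32_t mono,
				   uint32_t raw)
{
	unsigned int s = atomic_load(&p->seq);

	assert((s & 1u) == 0);		/* no update may already be in flight */
	atomic_store(&p->seq, s + 1);	/* mark the update as in progress */
	atomic_store(&p->mult_mono, mono);
	atomic_store(&p->mult_raw, raw);
	atomic_store(&p->seq, s + 2);	/* publish the new, stable pair */
}

/* Reader side: retry until both values come from the same update. */
static inline void mult_pair_read(struct mult_pair *p, uint32_t *mono,
				  uint32_t *raw)
{
	unsigned int s0, s1;

	do {
		s0 = atomic_load(&p->seq);
		*mono = atomic_load(&p->mult_mono);
		*raw = atomic_load(&p->mult_raw);
		s1 = atomic_load(&p->seq);
	} while ((s0 & 1u) || s0 != s1);
}
```

The point of the retry loop is that a reader never returns a mult_mono from one update paired with a mult_raw from another; whether the helper above needs that guarantee is exactly what the comment (or the locking) should settle.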

thanks
-john