lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 19 Apr 2021 22:52:14 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Marcelo Tosatti <mtosatti@...hat.com>
Cc:     Anna-Maria Behnsen <anna-maria@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Frederic Weisbecker <frederic@...nel.org>,
        Peter Xu <peterx@...hat.com>,
        Nitesh Narayan Lal <nitesh@...hat.com>,
        Alex Belits <abelits@...vell.com>
Subject: Re: [PATCH v6] hrtimer: avoid retrigger_next_event IPI

On Mon, Apr 19 2021 at 16:39, Marcelo Tosatti wrote:
>  
> +static void clock_was_set_force_reprogram_work(struct work_struct *work)
> +{
> +	clock_was_set(true);
> +}
> +
> +static DECLARE_WORK(hrtimer_force_reprogram_work, clock_was_set_force_reprogram_work);
> +
> +
>  static void clock_was_set_work(struct work_struct *work)
>  {
> -	clock_was_set();
> +	clock_was_set(false);
>  }
>  
>  static DECLARE_WORK(hrtimer_work, clock_was_set_work);
> @@ -769,9 +777,12 @@ static DECLARE_WORK(hrtimer_work, clock_was_set_work);
>   * Called from timekeeping and resume code to reprogram the hrtimer
>   * interrupt device on all cpus.
>   */
> -void clock_was_set_delayed(void)
> +void clock_was_set_delayed(bool force_reprogram)
>  {
> -	schedule_work(&hrtimer_work);
> +	if (force_reprogram)
> +		schedule_work(&hrtimer_force_reprogram_work);
> +	else
> +		schedule_work(&hrtimer_work);
>  }
>  
>  #else
> @@ -871,6 +882,18 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
>  	tick_program_event(expires, 1);
>  }
>  
> +#define CLOCK_SET_BASES ((1U << HRTIMER_BASE_REALTIME) |	\
> +			 (1U << HRTIMER_BASE_REALTIME_SOFT) |	\
> +			 (1U << HRTIMER_BASE_TAI) |		\
> +			 (1U << HRTIMER_BASE_TAI_SOFT) |	\
> +			 (1U << HRTIMER_BASE_BOOTTIME) |	\
> +			 (1U << HRTIMER_BASE_BOOTTIME_SOFT))
> +
> +static bool need_reprogram_timer(struct hrtimer_cpu_base *cpu_base)
> +{
> +	return (cpu_base->active_bases & CLOCK_SET_BASES) != 0;
> +}
> +
>  /*
>   * Clock realtime was set
>   *
> @@ -882,11 +905,42 @@ static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
>   * resolution timer interrupts. On UP we just disable interrupts and
>   * call the high resolution interrupt code.
>   */
> -void clock_was_set(void)
> +void clock_was_set(bool force_reprogram)
>  {
>  #ifdef CONFIG_HIGH_RES_TIMERS
> -	/* Retrigger the CPU local events everywhere */
> -	on_each_cpu(retrigger_next_event, NULL, 1);
> +	cpumask_var_t mask;
> +	int cpu;
> +
> +	if (force_reprogram == true) {
> +		on_each_cpu(retrigger_next_event, NULL, 1);
> +		goto set_timerfd;
> +	}
> +
> +	if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) {
> +		on_each_cpu(retrigger_next_event, NULL, 1);
> +		goto set_timerfd;
> +	}

This is really horrible and the proper thing to do is to seperate the
s2idle/resume related functionality which allows to do some other
cleanups as well.

I already started to look at that over the weekend, but got stuck due to
a couple of correctness issues I found with the whole clock_was_set()
mechanism while looking. That stuff needs to be fixed first.

Back to the above. For the s2idle resume path there is no reason for an
IPI at all. It's just that way because it has been bolted on the
existing machinery.

So today it does:

   tick_unfreeze() {
     if (first_cpu_to_unfreeze()) {
        timekeeping_resume();
          tick_resume();
            tick_resume_local();
          clock_was_set_delayed();
     } else {
        tick_resume_local();
     }
   }

Then after everything is unfrozen including workqueues the delayed
clock_was_set() runs and does the IPI dance. That's just silly.

We can be smarter than that:

   tick_unfreeze() {
     if (first_cpu_to_unfreeze())
        timekeeping_resume();
          tick_resume();
            tick_resume_local();
              hrtimer_resume_local();
     } else {
        tick_resume_local();
          hrtimer_resume_local();
     }
   }

See? No IPI required at all and no magic force argument and as a bonus
we get also understandable code.

Splitting this properly allows to be smarter about the affected clocks
because most of the clock_was_set() events only care about
CLOCK_REALTIME and CLOCK_TAI and just the sleep time injection affects
CLOCK_BOOTTIME as well.

There are a few other things which can be done to avoid even more IPIs,
but let me address the correctness issues of clock_was_set() first.

Thanks,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ