lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <D1KVETKKSRHL.18ZTVKAN8JS3Y@gmail.com>
Date: Tue, 28 May 2024 11:02:03 +1000
From: "Nicholas Piggin" <npiggin@...il.com>
To: "Oleg Nesterov" <oleg@...hat.com>
Cc: "Thomas Gleixner" <tglx@...utronix.de>, "Frederic Weisbecker"
 <frederic@...nel.org>, "Ingo Molnar" <mingo@...hat.com>, "Peter Zijlstra"
 <peterz@...radead.org>, "Phil Auld" <pauld@...hat.com>, "Chris von
 Recklinghausen" <crecklin@...hat.com>, <linux-kernel@...r.kernel.org>
Subject: Re: sched/isolation: tick_take_do_timer_from_boot() calls
 smp_call_function_single() with irqs disabled

On Tue May 28, 2024 at 1:57 AM AEST, Oleg Nesterov wrote:
> On 05/27, Nicholas Piggin wrote:
> >
> > On Mon May 27, 2024 at 5:27 AM AEST, Oleg Nesterov wrote:
> > > The more I grep the more I confused.
> > >
> > > On 05/25, Thomas Gleixner wrote:
> > > >
> > > > Right. It does not happen because the kernel starts with jiffies as
> > > > clocksource except on S390. The jiffies clocksource is not qualified to
> > > > switch over to NOHZ mode for obvious reasons.
> > >
> > > Not obvious for those who never looked at this code ;)
> > >
> > > OK, clocksource_jiffies doesn't have CLOCK_SOURCE_VALID_FOR_HRES,
> >
> > jiffies clocksource requires a timer to run it I suppose.
>
> I meant, this is enough to ensure that clocksource_done_booting() paths should
> find another cs, best != curr_clocksource, and call timekeeping_notify().
>
> Nevermind, quite possibly I misread this code.
>
> > > And I still don't understand why we can rely on can_stop_idle_tick() even
> > > in tick_nohz_idle_stop_tick().
> >
> > AFAIKS timer_expires_base would be 0 unless tick_nohz_next_event()
> > were called, but that is only called from places that checked
> > can_stop_idle_tick() or is already a tick_nohz_full() CPU.
>
> OK, thanks.
>
> So, Frederic, Nicholas, any objections to the trivial change below?

Since Thomas says it's alright, then no. I guess I added it because I
was not certain about taking the tick_do_timer_cpu while the boot CPU
could be running a timer interrupt.

I would take some of his comment to explain the race is harmless and
put it in that if block.

Out of curiosity, you are getting this going on x86? Or another arch?
Any particular use-case?

Thanks,
Nick

>
> We can cleanup the tick_do_timer_boot_cpu logic on top of it.
>
> Oleg.
>
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -178,26 +178,6 @@ void tick_setup_periodic(struct clock_event_device *dev, int broadcast)
>  	}
>  }
>  
> -#ifdef CONFIG_NO_HZ_FULL
> -static void giveup_do_timer(void *info)
> -{
> -	int cpu = *(unsigned int *)info;
> -
> -	WARN_ON(tick_do_timer_cpu != smp_processor_id());
> -
> -	tick_do_timer_cpu = cpu;
> -}
> -
> -static void tick_take_do_timer_from_boot(void)
> -{
> -	int cpu = smp_processor_id();
> -	int from = tick_do_timer_boot_cpu;
> -
> -	if (from >= 0 && from != cpu)
> -		smp_call_function_single(from, giveup_do_timer, &cpu, 1);
> -}
> -#endif
> -
>  /*
>   * Setup the tick device
>   */
> @@ -231,9 +211,8 @@ static void tick_setup_device(struct tick_device *td,
>  
>  		} else if (tick_do_timer_boot_cpu != -1 &&
>  						!tick_nohz_full_cpu(cpu)) {
> -			tick_take_do_timer_from_boot();
>  			tick_do_timer_boot_cpu = -1;
> -			WARN_ON(READ_ONCE(tick_do_timer_cpu) != cpu);
> +			WRITE_ONCE(tick_do_timer_cpu, cpu);
>  #endif
>  		}
>  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ