lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZxeomPnsi6oGHKPT@localhost.localdomain>
Date: Tue, 22 Oct 2024 15:28:56 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, rcu@...r.kernel.org,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Anna-Maria Behnsen <anna-maria@...utronix.de>,
	Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...nel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on
 PREEMPT_RT.

Le Fri, Oct 04, 2024 at 12:17:04PM +0200, Sebastian Andrzej Siewior a écrit :
> A timer/ hrtimer softirq is raised in-IRQ context. With threaded
> interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
> for the processing of the softirq.

It took me some time to understand the actual problem (yeah I know...)

Can this be rephrased as: "Timer and hrtimer softirq vectors are special
in that they are always raised in-IRQ context whereas other vectors are
more likely to be raised from threaded interrupts or any regular tasks
when threaded interrupts or PREEMPT_RT are enabled. This leads to
waking ksoftirqd for the processing of the softirqs whenever timer
vectors are involved.

>
> Once the ksoftirqd is marked as pending (or is running) it will collect
> all raised softirqs. This in turn means that a softirq which would have
> been processed at the end of the threaded interrupt, which runs at an
> elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> priority and competes with every regular task for CPU resources.

But for ksoftirqd to collect other non-timers softirqs, those vectors must
have been raised before from a threaded interrupt, right? So why those
vectors didn't get a chance to execute at the end of that threaded interrupt?

OTOH one problem I can imagine is a threaded interrupt preempting ksoftirqd
which must wait for ksoftirqd to complete due to the local_bh_disable()
in the beginning of irq_forced_thread_fn(). But then isn't there some
PI involved on the local lock?

Sorry I'm probably missing something...

>
> This introduces long delays on heavy loaded systems and is not desired
> especially if the system is not overloaded by the softirqs.
> 
> Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
> timers thread and let it run at the lowest SCHED_FIFO priority.
> Wake-ups for RT tasks happen from hardirq context so only timer_list timers
> and hrtimers for "regular" tasks are processed here.

That last sentence confuses me. How are timers for non regular task processed
elsewhere? Ah you mean RT tasks rely on hard hrtimers?

> The higher priority
> ensures that wakeups are performed before scheduling SCHED_OTHER tasks.
> 
> Using a dedicated variable to store the pending softirq bits values
> ensure that the timer are not accidentally picked up by ksoftirqd and
> other threaded interrupts.
> It shouldn't be picked up by ksoftirqd since it runs at lower priority.
> However if ksoftirqd is already running while a timer fires, then
> ksoftird will be PI-boosted due to the BH-lock to ktimer's priority.
> Ideally we try to avoid having ksoftirqd running.
> 
> The timer thread can pick up pending softirqs from ksoftirqd but only
> if the softirq load is high. It is not be desired that the picked up
> softirqs are processed at SCHED_FIFO priority under high softirq load
> but this can already happen by a PI-boost by a force-threaded interrupt.
> 
> [ frederic@...nel.org: rcutorture.c fixes, storm fix by introduction of
>   local_pending_timers() for tick_nohz_next_event() ]
> 
> [ junxiao.chang@...el.com: Ensure ktimersd gets woken up even if a
>   softirq is currently served. ]
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
>  include/linux/interrupt.h | 29 ++++++++++++++
>  kernel/rcu/rcutorture.c   |  6 +++
>  kernel/softirq.c          | 82 ++++++++++++++++++++++++++++++++++++++-
>  kernel/time/hrtimer.c     |  4 +-
>  kernel/time/tick-sched.c  |  2 +-
>  kernel/time/timer.c       |  2 +-
>  6 files changed, 120 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 457151f9f263d..4a4f367cd6864 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -616,6 +616,35 @@ extern void __raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq(unsigned int nr);
>  
> +#ifdef CONFIG_PREEMPT_RT

This needs a comment section to explain why a dedicated
timers processing is needed.

> +DECLARE_PER_CPU(struct task_struct *, timersd);
> +DECLARE_PER_CPU(unsigned long, pending_timer_softirq);
> +
> +extern void raise_timer_softirq(void);
> +extern void raise_hrtimer_softirq(void);
> +
> +static inline unsigned int local_pending_timers(void)

Let's align with local_softirq_pending() naming and rather
have local_timers_pending() ?

> +{
> +	return __this_cpu_read(pending_timer_softirq);
> +}
> +
> +#ifdef CONFIG_PREEMPT_RT
> +static void timersd_setup(unsigned int cpu)
> +{

That also needs a comment.

> +	sched_set_fifo_low(current);
> +}
> +
> +static int timersd_should_run(unsigned int cpu)
> +{
> +	return local_pending_timers();
> +}
> +
> +static void run_timersd(unsigned int cpu)
> +{
> +	unsigned int timer_si;
> +
> +	ksoftirqd_run_begin();
> +
> +	timer_si = local_pending_timers();
> +	__this_cpu_write(pending_timer_softirq, 0);
> +	or_softirq_pending(timer_si);
> +
> +	__do_softirq();
> +
> +	ksoftirqd_run_end();
> +}
> +
> +static void raise_ktimers_thread(unsigned int nr)
> +{
> +	trace_softirq_raise(nr);
> +	__this_cpu_or(pending_timer_softirq, 1 << nr);
> +}
> +
> +void raise_hrtimer_softirq(void)
> +{
> +	raise_ktimers_thread(HRTIMER_SOFTIRQ);
> +}
> +
> +void raise_timer_softirq(void)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	raise_ktimers_thread(TIMER_SOFTIRQ);
> +	wake_timersd();

This is supposed to be called from hardirq only, right?
Can't irq_exit_rcu() take care of it? Why is it different
from HRTIMER_SOFTIRQ ?

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ