lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 19 Dec 2022 12:33:22 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        rafael@...nel.org, daniel.lezcano@...aro.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        len.brown@...el.com, Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [RFC/RFT PATCH 1/2] sched/core: Check and schedule ksoftirq

On Fri, Dec 16, 2022 at 11:07:48PM +0100, Frederic Weisbecker wrote:
> On Fri, Dec 16, 2022 at 12:19:34PM +0100, Peter Zijlstra wrote:
> > On Thu, Dec 15, 2022 at 10:42:59AM -0800, Srinivas Pandruvada wrote:
> > > +		/* Give ksoftirqd 1 jiffy to get a chance to start its job */
> > > +		if (!READ_ONCE(it.done) && task_is_running(__this_cpu_read(ksoftirqd))) {
> > > +			__set_current_state(TASK_UNINTERRUPTIBLE);
> > > +			schedule_timeout(1);
> > > +		}
> > 
> > That's absolutely disgusting :-/
> 
> I know, and I hate checking task_is_running(__this_cpu_read(ksoftirqd))
> everywhere in idle. And in fact it doesn't work because some cpuidle drivers
> also do need_resched() checks.

quite a few indeed.

> I guess that either we assume that the idle injection is more important
> than serving softirqs and we shutdown the warnings accordingly, or we arrange
> for idle injection to have a lower prio than ksoftirqd.

ksoftirq is typically a CFS task while idle injection is required to be
a FIFO (typically FIFO-1) task -- so that would require lifting
ksoftirqd to FIFO and that has other problems.

I'm guessing the problem case is where idle injection comes in while
ksoftirq is running (and preempted), because in that case you cannot run
the softirq stuff in-place.

The 'right' thing to do would be to PI boost ksoftirqd in that case I
suppose. Perhaps something like so, it would boost ksoftirq when it's
running, and otherwise runs the ksoftirqd thing in-situ.

I've not really throught hard about this though, so perhaps I completely
wrecked things.


---
 include/linux/interrupt.h   |  3 +++
 kernel/sched/build_policy.c |  1 +
 kernel/sched/idle.c         |  4 ++++
 kernel/softirq.c            | 20 +++++++++++++++++---
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a92bce40b04b..a2db811d6947 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -606,6 +606,9 @@ extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 
 DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
+DECLARE_PER_CPU(struct rt_mutex, ksoftirq_lock);
+
+extern bool __run_ksoftirqd(unsigned int cpu);
 
 static inline struct task_struct *this_cpu_ksoftirqd(void)
 {
diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
index d9dc9ab3773f..731c595e551c 100644
--- a/kernel/sched/build_policy.c
+++ b/kernel/sched/build_policy.c
@@ -28,6 +28,7 @@
 #include <linux/suspend.h>
 #include <linux/tsacct_kern.h>
 #include <linux/vtime.h>
+#include <linux/interrupt.h>
 
 #include <uapi/linux/sched/types.h>
 
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index f26ab2675f7d..093bfdfca2f1 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -370,6 +370,10 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
 	WARN_ON_ONCE(!duration_ns);
 	WARN_ON_ONCE(current->mm);
 
+	rt_mutex_lock(&per_cpu(ksoftirq_lock, cpu));
+	__run_ksoftirqd(smp_processor_id());
+	rt_mutex_unlock(&per_cpu(ksoftirq_lock, cpu));
+
 	rcu_sleep_check();
 	preempt_disable();
 	current->flags |= PF_IDLE;
diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..a2cb600383a4 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -59,6 +59,7 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
 DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(struct rt_mutex, ksoftirq_lock);
 
 const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
@@ -912,6 +913,7 @@ void __init softirq_init(void)
 			&per_cpu(tasklet_vec, cpu).head;
 		per_cpu(tasklet_hi_vec, cpu).tail =
 			&per_cpu(tasklet_hi_vec, cpu).head;
+		rt_mutex_init(&per_cpu(ksoftirq_mutex, cpu));
 	}
 
 	open_softirq(TASKLET_SOFTIRQ, tasklet_action);
@@ -923,7 +925,7 @@ static int ksoftirqd_should_run(unsigned int cpu)
 	return local_softirq_pending();
 }
 
-static void run_ksoftirqd(unsigned int cpu)
+bool __run_ksoftirqd(unsigned int cpu)
 {
 	ksoftirqd_run_begin();
 	if (local_softirq_pending()) {
@@ -933,10 +935,22 @@ static void run_ksoftirqd(unsigned int cpu)
 		 */
 		__do_softirq();
 		ksoftirqd_run_end();
-		cond_resched();
-		return;
+		return true;
 	}
 	ksoftirqd_run_end();
+	return false;
+}
+
+static void run_ksoftirqd(unsigned int cpu)
+{
+	bool run;
+
+	rt_mutex_lock(&per_cpu(ksoftirq_lock, cpu));
+	ran = __run_ksoftirqd(cpu);
+	rt_mutex_unlock(&per_cpu(ksoftirq_lock, cpu));
+
+	if (ran)
+		cond_resched();
 }
 
 #ifdef CONFIG_HOTPLUG_CPU

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ