lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230516122202.954313-1-alex@shruggie.ro>
Date:   Tue, 16 May 2023 15:22:02 +0300
From:   Alexandru Ardelean <alex@...uggie.ro>
To:     linux-kernel@...r.kernel.org
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, tian.xianting@....com,
        steffen.aschbacher@...hl.de, Alexandru Ardelean <alex@...uggie.ro>
Subject: [PATCH][V2] sched/rt: Print curr when RT throttling activated

From: Xianting Tian <tian.xianting@....com>

We may meet the issue, that one RT thread occupied the cpu by 950ms/1s,
The RT thread maybe is a business thread or other unknown thread.

Currently, it only outputs the print "sched: RT throttling activated"
when RT throttling happen. It is hard to know what is the RT thread,
For further analysis, we need add more prints.

This patch is to print current RT task when RT throttling activated,
It help us to know what is the RT thread in the first time.

Tested-by: Alexandru Ardelean <alex@...uggie.ro>
Signed-off-by: Xianting Tian <tian.xianting@....com>
---

Initial patch submission:
  https://lore.kernel.org/all/f3265adc26d4416dacf157f61fa60ad6@h3c.com/T/

We've been having some issues of our own with some applications + some RT
configuration == some threads endded up taking too much CPU time.
This patch came in quite in handy to see in logs which thread is more
problematic.

We've applied this patch onto a 5.10.116 tree. It did need a bit of
re-applying since some context has changed since the initial version (hence
the V2 tag).
Since 5.10.116 (where we used it), it applied nicely to the latest/current
linux-next tree (hence the Tested-by tag).

It would be nice to apply this to the mainline kernel and have this handy.

 kernel/sched/rt.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 00e0e5074115..44b161e42733 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -995,7 +995,7 @@ static inline int rt_se_prio(struct sched_rt_entity *rt_se)
 	return rt_task_of(rt_se)->prio;
 }
 
-static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
+static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq, struct task_struct *curr)
 {
 	u64 runtime = sched_rt_runtime(rt_rq);
 
@@ -1019,7 +1019,8 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
 		 */
 		if (likely(rt_b->rt_runtime)) {
 			rt_rq->rt_throttled = 1;
-			printk_deferred_once("sched: RT throttling activated\n");
+			printk_deferred_once("sched: RT throttling activated (curr: pid %d, comm %s)\n",
+						curr->pid, curr->comm);
 		} else {
 			/*
 			 * In case we did anyway, make it go away,
@@ -1074,7 +1075,7 @@ static void update_curr_rt(struct rq *rq)
 		if (sched_rt_runtime(rt_rq) != RUNTIME_INF) {
 			raw_spin_lock(&rt_rq->rt_runtime_lock);
 			rt_rq->rt_time += delta_exec;
-			exceeded = sched_rt_runtime_exceeded(rt_rq);
+			exceeded = sched_rt_runtime_exceeded(rt_rq, curr);
 			if (exceeded)
 				resched_curr(rq);
 			raw_spin_unlock(&rt_rq->rt_runtime_lock);
-- 
2.40.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ