Date:   Sat, 18 Feb 2017 14:07:50 +0530
From:   Arun Raghavan <arun@...nraghavan.net>
To:     linux-kernel@...r.kernel.org
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Arun Raghavan <arun@...nraghavan.net>
Subject: [RESEND 2] [PATCH] rlimits: Print more information when limits are exceeded

Log the task name and PID when a process exceeds its CPU (RLIMIT_CPU) or
RT (RLIMIT_RTTIME) limit, for both the soft and the hard limit. This makes
debugging easier when userspace triggers these limits.

Signed-off-by: Arun Raghavan <arun@...nraghavan.net>
---
 kernel/time/posix-cpu-timers.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Hello,
This has come up a couple of times in the past, but we haven't been able to
resolve whatever issues were pointed out.

In the meantime, we have frustrated users who don't know where they're getting
a SIGKILL from, and I'd really like to have a way for people to not have to go
through this.

The issues that came up the last time were:

 1. SIGXCPU messages shouldn't be needed since they can be caught: it's still
    useful to have the log because it isn't always possible to pin down the
    thread causing the problem in userspace.

 2. SIGKILL logging should be centralised: there seem to be multiple paths that
    trigger a SIGKILL -- and it seemed a bit ugly to try to add a reason
    parameter on all of them for the KILL case. Any other suggestions on how to
    deal with this?

I'm happy to fix this up so that it actually makes it in this time, but if
there aren't any suggestions, just pushing this out as-is will make our lives
a little less painful.

Thanks,
Arun

diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index e9e8c10..6dbcf84 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -860,6 +860,9 @@ static void check_thread_timers(struct task_struct *tsk,
 			 * At the hard limit, we just die.
 			 * No need to calculate anything else now.
 			 */
+			printk(KERN_INFO
+				"RT Watchdog Timeout (hard): %s[%d]\n",
+				tsk->comm, task_pid_nr(tsk));
 			__group_send_sig_info(SIGKILL, SEND_SIG_PRIV, tsk);
 			return;
 		}
@@ -872,7 +875,7 @@ static void check_thread_timers(struct task_struct *tsk,
 				sig->rlim[RLIMIT_RTTIME].rlim_cur = soft;
 			}
 			printk(KERN_INFO
-				"RT Watchdog Timeout: %s[%d]\n",
+				"RT Watchdog Timeout (soft): %s[%d]\n",
 				tsk->comm, task_pid_nr(tsk));
 			__group_send_sig_info(SIGXCPU, SEND_SIG_PRIV, tsk);
 		}
@@ -980,6 +983,9 @@ static void check_process_timers(struct task_struct *tsk,
 			 * At the hard limit, we just die.
 			 * No need to calculate anything else now.
 			 */
+			printk(KERN_INFO
+				"CPU Watchdog Timeout (hard): %s[%d]\n",
+				tsk->comm, task_pid_nr(tsk));
 			__group_send_sig_info(SIGKILL, SEND_SIG_PRIV, tsk);
 			return;
 		}
@@ -987,6 +993,9 @@ static void check_process_timers(struct task_struct *tsk,
 			/*
 			 * At the soft limit, send a SIGXCPU every second.
 			 */
+			printk(KERN_INFO
+				"CPU Watchdog Timeout (soft): %s[%d]\n",
+				tsk->comm, task_pid_nr(tsk));
 			__group_send_sig_info(SIGXCPU, SEND_SIG_PRIV, tsk);
 			if (soft < hard) {
 				soft++;
-- 
2.9.3
