Message-ID: <20091012202018.GH32088@hmsreliant.think-freely.org>
Date:	Mon, 12 Oct 2009 16:20:18 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	linux-kernel@...r.kernel.org
Cc:	akpm@...ux-foundation.org, marcin.slusarz@...il.com,
	nhorman@...driver.com
Subject: Re: [PATCH 1/3] extend get/setrlimit to support setting rlimits
	external to a process (v6)

Augment /proc/<pid>/limits file to support limit setting

It was suggested to me recently that we support a mechanism for setting a
process's resource limits from outside that process.  The reasoning is that
some processes are very long lived, and it would be beneficial if we could
modify their limits without needing to kill them, adjust the limits for the
user, and restart them.  While individual applications can certainly export
this control on their own, it would be nice if such functionality were
available to a sysadmin without each application needing to re-invent the
wheel.

As such, I've implemented the patch below, which makes /proc/<pid>/limits
writable for each process.  By writing a line in the following format:
<limit> <current value> <max value>
to the limits file, an administrator can now dynamically change the limits of
the respective process.  Tested by myself with good results.

Signed-off-by: Neil Horman <nhorman@...driver.com>
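
As an illustration of the write format described above, here is a minimal
userspace sketch in C.  It assumes a kernel with this patch applied; the
helper names format_limit_line() and set_limit() are hypothetical and are
not part of the patch.  The limit name is one of the "Set String" values
from the lnames[] table in the patch (e.g. "core", "nofile").

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/resource.h>

/* Render one limit value; RLIM_INFINITY is spelled "unlimited". */
static int format_value(char *out, size_t len, rlim_t v)
{
	if (v == RLIM_INFINITY)
		return snprintf(out, len, "unlimited");
	return snprintf(out, len, "%llu", (unsigned long long)v);
}

/*
 * Build "<limit> <soft> <hard>\n" in out.  Returns the line length, or -1
 * if the line does not fit or soft exceeds hard (do_setrlimit() in the
 * patch rejects rlim_cur > rlim_max with -EINVAL).
 */
int format_limit_line(char *out, size_t len, const char *name,
		      rlim_t soft, rlim_t hard)
{
	char s[32], h[32];
	int n;

	if (soft > hard)
		return -1;
	format_value(s, sizeof(s), soft);
	format_value(h, sizeof(h), hard);
	n = snprintf(out, len, "%s %s %s\n", name, s, h);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write the line to /proc/<pid>/limits.  The line is kept short so stdio
 * flushes it as a single write at offset 0, which is what the patch's
 * write handler expects.
 */
int set_limit(pid_t pid, const char *name, rlim_t soft, rlim_t hard)
{
	char path[64], line[128];
	FILE *f;
	int n, ok;

	n = format_limit_line(line, sizeof(line), name, soft, hard);
	if (n < 0)
		return -1;
	snprintf(path, sizeof(path), "/proc/%ld/limits", (long)pid);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fwrite(line, 1, (size_t)n, f) == (size_t)n;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling set_limit(2000, "core", 1024, RLIM_INFINITY) would write the same
line as the shell example in the proc.txt hunk below.  Since the write
handler as posted returns the byte count regardless of the do_setrlimit()
result, reading the file back is the safest way to confirm the change.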


 Documentation/filesystems/proc.txt |   26 +++++
 fs/proc/base.c                     |  184 ++++++++++++++++++++++++++++++-------
 include/linux/sched.h              |    3 
 kernel/sys.c                       |   48 ++++++---
 4 files changed, 209 insertions(+), 52 deletions(-)


diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2c48f94..62fd7f5 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -38,6 +38,7 @@ Table of Contents
   3.3	/proc/<pid>/io - Display the IO accounting fields
   3.4	/proc/<pid>/coredump_filter - Core dump filtering settings
   3.5	/proc/<pid>/mountinfo - Information about mounts
+  3.6	/proc/<pid>/limits - Information about process rlimit values
 
 
 ------------------------------------------------------------------------------
@@ -1408,3 +1409,28 @@ For more information on mount propagation see:
 
   Documentation/filesystems/sharedsubtree.txt
 
+3.6	/proc/<pid>/limits - Information about rlimit values
+------------------------------------------------------------
+
+This file contains information regarding the process's rlimit settings.
+Normally this information is only available programmatically via the
+getrlimit/setrlimit syscalls.  This file exports it so that sysadmins may
+dynamically see their values.  This file contains lines of the form:
+
+Limit     Set String     Soft Limit     Hard Limit     Units 
+
+Limit - A description of the limit
+Set String - A concise string identifying the limit
+Soft Limit - The rlim_cur value returned by getrlimit for the corresponding limit
+Hard Limit - The rlim_max value returned by getrlimit for the corresponding limit
+Units	   - The units that the given limit is measured in
+
+Limits for a given process can also be set by writing a string to this file
+in the following format:
+<Set String> [value|"unlimited"] [value|"unlimited"] > /proc/<pid>/limits
+
+For example, to set the maximum core file size for process 2000 to a soft limit
+of 1024 bytes and a hard limit of unlimited, we would do the following from a
+shell prompt:
+echo core 1024 unlimited > /proc/2000/limits
+
diff --git a/fs/proc/base.c b/fs/proc/base.c
index dd5bed0..69d2a55 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -49,6 +49,8 @@
 
 #include <asm/uaccess.h>
 
+#include <linux/string.h>
+#include <linux/ctype.h>
 #include <linux/errno.h>
 #include <linux/time.h>
 #include <linux/proc_fs.h>
@@ -456,72 +458,186 @@ static int proc_oom_score(struct task_struct *task, char *buffer)
 struct limit_names {
 	char *name;
 	char *unit;
+	char *match;
 };
 
 static const struct limit_names lnames[RLIM_NLIMITS] = {
-	[RLIMIT_CPU] = {"Max cpu time", "seconds"},
-	[RLIMIT_FSIZE] = {"Max file size", "bytes"},
-	[RLIMIT_DATA] = {"Max data size", "bytes"},
-	[RLIMIT_STACK] = {"Max stack size", "bytes"},
-	[RLIMIT_CORE] = {"Max core file size", "bytes"},
-	[RLIMIT_RSS] = {"Max resident set", "bytes"},
-	[RLIMIT_NPROC] = {"Max processes", "processes"},
-	[RLIMIT_NOFILE] = {"Max open files", "files"},
-	[RLIMIT_MEMLOCK] = {"Max locked memory", "bytes"},
-	[RLIMIT_AS] = {"Max address space", "bytes"},
-	[RLIMIT_LOCKS] = {"Max file locks", "locks"},
-	[RLIMIT_SIGPENDING] = {"Max pending signals", "signals"},
-	[RLIMIT_MSGQUEUE] = {"Max msgqueue size", "bytes"},
-	[RLIMIT_NICE] = {"Max nice priority", NULL},
-	[RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
-	[RLIMIT_RTTIME] = {"Max realtime timeout", "us"},
+	[RLIMIT_CPU] = {"Max cpu time", "seconds", "cpu"},
+	[RLIMIT_FSIZE] = {"Max file size", "bytes", "fsize"},
+	[RLIMIT_DATA] = {"Max data size", "bytes", "data"},
+	[RLIMIT_STACK] = {"Max stack size", "bytes", "stack"},
+	[RLIMIT_CORE] = {"Max core file size", "bytes", "core"},
+	[RLIMIT_RSS] = {"Max resident set", "bytes", "rss"},
+	[RLIMIT_NPROC] = {"Max processes", "processes", "nproc"},
+	[RLIMIT_NOFILE] = {"Max open files", "files", "nofile"},
+	[RLIMIT_MEMLOCK] = {"Max locked memory", "bytes", "memlock"},
+	[RLIMIT_AS] = {"Max address space", "bytes", "as"},
+	[RLIMIT_LOCKS] = {"Max file locks", "locks", "locks"},
+	[RLIMIT_SIGPENDING] = {"Max pending signals", "signals", "sigpending"},
+	[RLIMIT_MSGQUEUE] = {"Max msgqueue size", "bytes", "msgqueue"},
+	[RLIMIT_NICE] = {"Max nice priority", NULL, "nice"},
+	[RLIMIT_RTPRIO] = {"Max realtime priority", NULL, "rtprio"},
+	[RLIMIT_RTTIME] = {"Max realtime timeout", "us", "rttime"},
 };
 
 /* Display limits for a process */
-static int proc_pid_limits(struct task_struct *task, char *buffer)
+static ssize_t proc_pid_limit_read(struct file *file, char __user *buf,
+		size_t count, loff_t *ppos)
 {
 	unsigned int i;
-	int count = 0;
 	unsigned long flags;
-	char *bufptr = buffer;
+	char *bufptr;
+	size_t bcount = 0;
+	size_t ccount = -ENOMEM;
+	struct task_struct *task = get_proc_task(file->f_path.dentry->d_inode);
 
 	struct rlimit rlim[RLIM_NLIMITS];
 
+	bufptr = kzalloc((RLIM_NLIMITS+1)*90, GFP_KERNEL);
+	if (!bufptr)
+		goto out;
+
+	ccount = -EBUSY;
+
 	if (!lock_task_sighand(task, &flags))
-		return 0;
+		goto out_free;
+
 	memcpy(rlim, task->signal->rlim, sizeof(struct rlimit) * RLIM_NLIMITS);
 	unlock_task_sighand(task, &flags);
 
 	/*
 	 * print the file header
 	 */
-	count += sprintf(&bufptr[count], "%-25s %-20s %-20s %-10s\n",
-			"Limit", "Soft Limit", "Hard Limit", "Units");
+	bcount += sprintf(&bufptr[bcount], "%-25s %-12s %-20s %-20s %-10s\n",
+			"Limit", "Set String", "Soft Limit", "Hard Limit", "Units");
 
 	for (i = 0; i < RLIM_NLIMITS; i++) {
 		if (rlim[i].rlim_cur == RLIM_INFINITY)
-			count += sprintf(&bufptr[count], "%-25s %-20s ",
-					 lnames[i].name, "unlimited");
+			bcount += sprintf(&bufptr[bcount], "%-25s %-12s %-20s ",
+					lnames[i].name, lnames[i].match,
+					"unlimited");
 		else
-			count += sprintf(&bufptr[count], "%-25s %-20lu ",
-					 lnames[i].name, rlim[i].rlim_cur);
-
+			bcount += sprintf(&bufptr[bcount], "%-25s %-12s %-20lu ",
+					lnames[i].name, lnames[i].match,
+					rlim[i].rlim_cur);
 		if (rlim[i].rlim_max == RLIM_INFINITY)
-			count += sprintf(&bufptr[count], "%-20s ", "unlimited");
+			bcount += sprintf(&bufptr[bcount], "%-20s ",
+					"unlimited");
 		else
-			count += sprintf(&bufptr[count], "%-20lu ",
+			bcount += sprintf(&bufptr[bcount], "%-20lu ",
 					 rlim[i].rlim_max);
-
 		if (lnames[i].unit)
-			count += sprintf(&bufptr[count], "%-10s\n",
+			bcount += sprintf(&bufptr[bcount], "%-10s\n",
 					 lnames[i].unit);
 		else
-			count += sprintf(&bufptr[count], "\n");
+			bcount += sprintf(&bufptr[bcount], "\n");
+	}
+
+	ccount = 0;	/* reading at or past the end of the data is EOF */
+
+	if (*ppos >= bcount)
+		goto out_task;
+ 
+	ccount = min(count, (size_t)(bcount-(*ppos)));
+	ccount = ccount - copy_to_user(buf, &bufptr[*ppos], ccount);
+	*ppos += ccount;
+out_task:
+	put_task_struct(task);
+out_free:
+	kfree(bufptr);
+out:
+	return ccount;
+}
+
+#define PROC_PID_BUF_SZ 128
+static ssize_t proc_pid_limit_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	char *buffer;
+	char *element, *vmc, *vmm;
+	struct rlimit new_rlim;
+	unsigned long flags;
+	int i;
+	int index = -1;
+	size_t wcount = -EMSGSIZE;
+	struct task_struct *task = get_proc_task(file->f_path.dentry->d_inode);
+ 
+ 
+	if (*ppos != 0)
+		goto out;
+ 
+	if (count > PROC_PID_BUF_SZ)
+		goto out;
+
+	wcount = -ENOMEM;
+	buffer = kzalloc(PROC_PID_BUF_SZ, GFP_KERNEL);
+ 
+	if (!buffer)
+		goto out;
+ 
+	element = kzalloc(PROC_PID_BUF_SZ, GFP_KERNEL);
+	vmc = kzalloc(PROC_PID_BUF_SZ, GFP_KERNEL);
+	vmm = kzalloc(PROC_PID_BUF_SZ, GFP_KERNEL);
+ 
+	if (!element || !vmm || !vmc)
+		goto out_free;
+
+	wcount = -EFAULT;
+ 
+	if (copy_from_user(buffer, buf, count))
+		goto out_free;
+ 
+	i = sscanf(buffer, "%s %s %s", element, vmc, vmm);
+ 
+	if (i < 3)
+		goto out_free;
+ 
+	if (!strncmp(vmc, "unlimited", 9))
+		new_rlim.rlim_cur = RLIM_INFINITY;
+	else
+		new_rlim.rlim_cur = simple_strtoul(vmc, NULL, 10);
+ 
+	if (!strncmp(vmm, "unlimited", 9))
+		new_rlim.rlim_max = RLIM_INFINITY;
+	else
+		new_rlim.rlim_max = simple_strtoul(vmm, NULL, 10);
+ 
+	for (i = 0; i < RLIM_NLIMITS; i++) {
+		if (!strncmp(element, lnames[i].match,
+		     strlen(lnames[i].match))) {
+			index = i;
+			break;
+		}
 	}
 
+	wcount = -EBUSY;
+ 
+	if (!lock_task_sighand(task, &flags))
+		goto out_free;
+
+	wcount = -ENOENT;
+ 
+	if ((index >= 0) && (index < RLIM_NLIMITS))
+		wcount = do_setrlimit(index, &new_rlim, task);
+ 
+	unlock_task_sighand(task, &flags);
+ 
+out_free:
+	kfree(element);
+	kfree(vmc);
+	kfree(vmm);
+	kfree(buffer);
+out:
+	*ppos += count;
+	put_task_struct(task);
 	return count;
 }
 
+static const struct file_operations proc_limit_operations = {
+	.read           = proc_pid_limit_read,
+	.write          = proc_pid_limit_write,
+};
+
 #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
 static int proc_pid_syscall(struct task_struct *task, char *buffer)
 {
@@ -2501,7 +2617,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	INF("auxv",       S_IRUSR, proc_pid_auxv),
 	ONE("status",     S_IRUGO, proc_pid_status),
 	ONE("personality", S_IRUSR, proc_pid_personality),
-	INF("limits",	  S_IRUSR, proc_pid_limits),
+	REG("limits",	  S_IRUSR|S_IWUSR, proc_limit_operations),
 #ifdef CONFIG_SCHED_DEBUG
 	REG("sched",      S_IRUGO|S_IWUSR, proc_pid_sched_operations),
 #endif
@@ -2836,7 +2952,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	INF("auxv",      S_IRUSR, proc_pid_auxv),
 	ONE("status",    S_IRUGO, proc_pid_status),
 	ONE("personality", S_IRUSR, proc_pid_personality),
-	INF("limits",	 S_IRUSR, proc_pid_limits),
+	REG("limits",	 S_IRUSR|S_IWUSR, proc_limit_operations),
 #ifdef CONFIG_SCHED_DEBUG
 	REG("sched",     S_IRUGO|S_IWUSR, proc_pid_sched_operations),
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2be3760..be54f28 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -672,6 +672,9 @@ struct signal_struct {
 	int oom_adj;	/* OOM kill score adjustment (bit shift) */
 };
 
+extern int do_setrlimit(unsigned int resource, struct rlimit *new_rlim,
+			struct task_struct *tsk);
+
 /* Context switch must be unlocked if interrupts are to be enabled */
 #ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
 # define __ARCH_WANT_UNLOCKED_CTXSW
diff --git a/kernel/sys.c b/kernel/sys.c
index 1828f8d..0e210a4 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1238,41 +1238,41 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned int, resource,
 
 #endif
 
-SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
+int do_setrlimit(unsigned int resource, struct rlimit *new_rlim,
+		 struct task_struct *tsk)
 {
-	struct rlimit new_rlim, *old_rlim;
 	int retval;
+	struct rlimit *old_rlim;
 
-	if (resource >= RLIM_NLIMITS)
-		return -EINVAL;
-	if (copy_from_user(&new_rlim, rlim, sizeof(*rlim)))
-		return -EFAULT;
-	if (new_rlim.rlim_cur > new_rlim.rlim_max)
+
+	if (new_rlim->rlim_cur > new_rlim->rlim_max)
 		return -EINVAL;
-	old_rlim = current->signal->rlim + resource;
-	if ((new_rlim.rlim_max > old_rlim->rlim_max) &&
+	old_rlim = tsk->signal->rlim + resource;
+
+	if ((new_rlim->rlim_max > old_rlim->rlim_max) &&
 	    !capable(CAP_SYS_RESOURCE))
 		return -EPERM;
-	if (resource == RLIMIT_NOFILE && new_rlim.rlim_max > sysctl_nr_open)
+
+	if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open)
 		return -EPERM;
 
-	retval = security_task_setrlimit(resource, &new_rlim);
+	retval = security_task_setrlimit(resource, new_rlim);
 	if (retval)
 		return retval;
 
-	if (resource == RLIMIT_CPU && new_rlim.rlim_cur == 0) {
+	if (resource == RLIMIT_CPU && new_rlim->rlim_cur == 0) {
 		/*
 		 * The caller is asking for an immediate RLIMIT_CPU
 		 * expiry.  But we use the zero value to mean "it was
 		 * never set".  So let's cheat and make it one second
 		 * instead
 		 */
-		new_rlim.rlim_cur = 1;
+		new_rlim->rlim_cur = 1;
 	}
 
-	task_lock(current->group_leader);
-	*old_rlim = new_rlim;
-	task_unlock(current->group_leader);
+	task_lock(tsk->group_leader);
+	*old_rlim = *new_rlim;
+	task_unlock(tsk->group_leader);
 
 	if (resource != RLIMIT_CPU)
 		goto out;
@@ -1283,14 +1283,26 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
 	 * very long-standing error, and fixing it now risks breakage of
 	 * applications, so we live with it
 	 */
-	if (new_rlim.rlim_cur == RLIM_INFINITY)
+	if (new_rlim->rlim_cur == RLIM_INFINITY)
 		goto out;
 
-	update_rlimit_cpu(new_rlim.rlim_cur);
+	update_rlimit_cpu(new_rlim->rlim_cur);
 out:
 	return 0;
 }
 
+SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
+{
+	struct rlimit new_rlim;
+
+	if (resource >= RLIM_NLIMITS)
+		return -EINVAL;
+	if (copy_from_user(&new_rlim, rlim, sizeof(*rlim)))
+		return -EFAULT;
+
+	return do_setrlimit(resource, &new_rlim, current);
+}
+
 /*
  * It would make sense to put struct rusage in the task_struct,
  * except that would make the task_struct be *really big*.  After