[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100331185950.GB11635@redhat.com>
Date: Wed, 31 Mar 2010 20:59:50 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: David Rientjes <rientjes@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
anfei <anfei.zhou@...il.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
nishimura@....nes.nec.co.jp,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Mel Gorman <mel@....ul.ie>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH -mm] proc: don't take ->siglock for /proc/pid/oom_adj
On 03/30, David Rientjes wrote:
>
> On Tue, 30 Mar 2010, Oleg Nesterov wrote:
>
> > ->siglock is no longer needed to access task->signal, change
> > oom_adjust_read() and oom_adjust_write() to read/write oom_adj
> > lockless.
> >
> > Yes, this means that "echo 2 >oom_adj" and "echo 1 >oom_adj"
> > can race and the second write can win, but I hope this is OK.
>
> Ok, but could you base this on -mm at
> http://userweb.kernel.org/~akpm/mmotm/ since an additional tunable has
> been added (oom_score_adj), which does the same thing?
David, I just can't understand why
oom-badness-heuristic-rewrite.patch
duplicates the related code in fs/proc/base.c and why it preserves
the deprecated signal->oom_adj.
OK. Please forget about lock_task_sighand/signal issues. Can't we kill
signal->oom_adj and create a single helper for both
/proc/pid/{oom_adj,oom_score_adj} ?
static ssize_t oom_any_adj_write(struct file *file, const char __user *buf,
size_t count, bool deprecated_mode)
{
struct task_struct *task;
char buffer[PROC_NUMBUF];
unsigned long flags;
long oom_score_adj;
int err;
memset(buffer, 0, sizeof(buffer));
if (count > sizeof(buffer) - 1)
count = sizeof(buffer) - 1;
if (copy_from_user(buffer, buf, count))
return -EFAULT;
err = strict_strtol(strstrip(buffer), 0, &oom_score_adj);
if (err)
return -EINVAL;
if (depraceted_mode) {
if (oom_score_adj == OOM_ADJUST_MAX)
oom_score_adj = OOM_SCORE_ADJ_MAX;
else
oom_score_adj = (oom_score_adj * OOM_SCORE_ADJ_MAX) /
-OOM_DISABLE;
}
if (oom_score_adj < OOM_SCORE_ADJ_MIN ||
oom_score_adj > OOM_SCORE_ADJ_MAX)
return -EINVAL;
task = get_proc_task(file->f_path.dentry->d_inode);
if (!task)
return -ESRCH;
if (!lock_task_sighand(task, &flags)) {
put_task_struct(task);
return -ESRCH;
}
if (oom_score_adj < task->signal->oom_score_adj &&
!capable(CAP_SYS_RESOURCE)) {
unlock_task_sighand(task, &flags);
put_task_struct(task);
return -EACCES;
}
task->signal->oom_score_adj = oom_score_adj;
unlock_task_sighand(task, &flags);
put_task_struct(task);
return count;
}
This is just the current oom_score_adj_read() + "if (depraceted_mode)"
which does oom_adj -> oom_score_adj conversion.
Now,
static ssize_t oom_adjust_write(...)
{
printk_once(KERN_WARNING "... deprecated ...\n");
return oom_any_adj_write(..., true);
}
static ssize_t oom_score_adj_write(...)
{
return oom_any_adj_write(..., false);
}
The same for oom_xxx_read().
What is the point to keep signal->oom_adj ?
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists