Message-ID: <4CC569DB.17734.314BBE7A@pageexec.freemail.hu>
Date: Mon, 25 Oct 2010 13:28:27 +0200
From: pageexec@...email.hu
To: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
CC: LKML <linux-kernel@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
Solar Designer <solar@...nwall.com>,
Eugene Teo <eteo@...hat.com>,
Brad Spengler <spender@...ecurity.net>,
Oleg Nesterov <oleg@...hat.com>,
Roland McGrath <roland@...hat.com>
Subject: Re: [resend][PATCH 4/4] oom: don't ignore rss in nascent mm
On 25 Oct 2010 at 12:29, KOSAKI Motohiro wrote:
hi,
i've got a few comments/questions about the whole approach, see them inline.
> index 0644a15..a85b196 100644
> --- a/fs/compat.c
> +++ b/fs/compat.c
> @@ -1567,8 +1567,10 @@ int compat_do_execve(char * filename,
> return retval;
>
> out:
> - if (bprm->mm)
> + if (bprm->mm) {
> + set_exec_mm(NULL);
> mmput(bprm->mm);
> + }
>
> out_file:
> if (bprm->file) {
> diff --git a/fs/exec.c b/fs/exec.c
> index 94dabd2..2395d10 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -347,6 +347,8 @@ int bprm_mm_init(struct linux_binprm *bprm)
> if (err)
> goto err;
>
> + set_exec_mm(mm);
> +
> return 0;
>
> err:
> @@ -1416,8 +1428,10 @@ int do_execve(const char * filename,
> return retval;
>
> out:
> - if (bprm->mm)
> + if (bprm->mm) {
> + set_exec_mm(NULL);
> mmput (bprm->mm);
> + }
>
> out_file:
> if (bprm->file) {
what happens when two (or more) threads in the same process call execve
concurrently? the above set_exec_mm calls will race (de_thread doesn't happen
until much later in execve) and overwrite each other's ->in_exec_mm, which
will still lead to problems since at most one temporary mm will be accounted
for in the oom killer.
[update: since i don't seem to have been cc'd on the other patch that
serializes execve, the above point is moot ;)]
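to make the overwrite concrete, here is a minimal userspace sketch of one possible interleaving. it is not kernel code: struct mm_model, struct process_model, set_exec_mm_model and oom_visible_rss are hypothetical stand-ins, and it assumes (as the point above implies) that ->in_exec_mm is a single slot shared by all threads of the process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mm_model {
	long rss;                      /* pages charged to this mm */
};

struct process_model {
	struct mm_model *mm;           /* the live address space */
	struct mm_model *in_exec_mm;   /* single slot shared by all threads (assumed) */
};

/* Models set_exec_mm() from the quoted patch as a plain overwrite. */
static void set_exec_mm_model(struct process_model *p, struct mm_model *mm)
{
	p->in_exec_mm = mm;
}

/* What an oom killer that honours ->in_exec_mm would see for this process. */
static long oom_visible_rss(const struct process_model *p)
{
	long rss;

	assert(p != NULL && p->mm != NULL);
	rss = p->mm->rss;
	if (p->in_exec_mm)
		rss += p->in_exec_mm->rss;
	return rss;
}

/*
 * Replays one interleaving sequentially: threads A and B both enter
 * execve, then A fails. out[] receives the visible rss after each step.
 */
static void replay_two_execs(long out[3])
{
	struct mm_model live  = { .rss = 10 };
	struct mm_model tmp_a = { .rss = 1000 };
	struct mm_model tmp_b = { .rss = 2000 };
	struct process_model p = { .mm = &live, .in_exec_mm = NULL };

	set_exec_mm_model(&p, &tmp_a);  /* thread A: end of bprm_mm_init() */
	out[0] = oom_visible_rss(&p);   /* live + tmp_a */

	set_exec_mm_model(&p, &tmp_b);  /* thread B overwrites A's slot */
	out[1] = oom_visible_rss(&p);   /* live + tmp_b, tmp_a is unaccounted */

	set_exec_mm_model(&p, NULL);    /* thread A fails and clears the slot */
	out[2] = oom_visible_rss(&p);   /* live only, tmp_b is unaccounted too */
}
```

in this replay the two temporary mms are never visible at the same time, and after A's failure path runs B's temporary mm is not visible at all even though B is still in execve.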
worse, even if each temporary mm was tracked separately there'd still be a
race where the oom killer gets triggered after the culprit thread is long
gone (having already reset ->in_exec_mm), never to be found, so the oom
killer would blame someone else.
now all this leads me to suggest a simpler solution, at least for the first
problem mentioned above (i don't know what to do with the second one yet as
it seems to be a generic issue with the oom killer, probably it should verify
the oom situation once again after it took the tasklist_lock).
[update: while the serialized execve solves the first problem, i still think
that my idea is simpler and worth considering, so i leave it here even if for
just documentation purposes ;)]
given that all the oom killer needs from the mm struct is either ->total_vm
(in .35 and before, so be careful with the stable backport) or some ->rss_stat
counters, wouldn't it be much easier to simply transfer the bprm->mm counters
into current->mm for the duration of the execve (say, add them in get_arg_page
and remove them when bprm->mm is mmput in the do_execve failure path, etc)? the
transfer can be either to the existing counters or to new ones (obviously in
the latter case the oom code needs a small change to take the new counters into
account as well).
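a rough userspace sketch of the "new counters" variant of this idea follows. everything in it is hypothetical (struct mm_model, the exec_rss field, exec_charge_page, exec_uncharge_all, oom_points); the real hooks would sit in get_arg_page and in the execve failure path, and locking/atomicity is left out entirely.

```c
#include <assert.h>

/* Hypothetical model of the counters the oom killer looks at. */
struct mm_model {
	long rss;        /* pages owned by this mm itself */
	long exec_rss;   /* pages mirrored from temporary exec mms (new counter) */
};

/*
 * get_arg_page() analogue: one page is charged to the temporary mm and
 * mirrored into the mm of the task that is calling execve.
 */
static void exec_charge_page(struct mm_model *cur, struct mm_model *bprm_mm)
{
	bprm_mm->rss++;
	cur->exec_rss++;
}

/*
 * Failure-path analogue, run before the temporary mm is dropped:
 * remove exactly what this exec mirrored into the current mm.
 */
static void exec_uncharge_all(struct mm_model *cur, const struct mm_model *bprm_mm)
{
	assert(cur->exec_rss >= bprm_mm->rss);
	cur->exec_rss -= bprm_mm->rss;
}

/* The "small change" in the oom code: take the new counter into account. */
static long oom_points(const struct mm_model *mm)
{
	return mm->rss + mm->exec_rss;
}
```

since the mirrored counts are additive, several threads of one process building temporary mms at the same time would all be reflected in the one current->mm the oom killer already inspects, without any per-thread pointer to track.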
cheers,
PaX Team