Message-ID: <20060906222556.GA168@oleg>
Date: Thu, 7 Sep 2006 02:25:56 +0400
From: Oleg Nesterov <oleg@...sign.ru>
To: Jean Delvare <jdelvare@...e.de>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Andrew Morton <akpm@...l.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
linux-kernel@...r.kernel.org, ak@...e.de
Subject: Re: [PATCH] proc: readdir race fix (take 3)
On 09/06, Jean Delvare wrote:
>
> On Wednesday 6 September 2006 11:01, Jean Delvare wrote:
> > Eric, Kame, thanks a lot for working on this. I'll be giving some good
> > testing to this patch today, and will return back to you when I'm done.
>
> The original issue is indeed fixed, but there's a problem with the patch.
> When stressing /proc (to verify the bug was fixed), my test machine ended
> up crashing. Here are the 2 traces I found in the logs:
>
> Sep 6 12:06:00 arrakis kernel: BUG: warning at
> kernel/fork.c:113/__put_task_struct()
> Sep 6 12:06:00 arrakis kernel: [<c0115f93>] __put_task_struct+0xf3/0x100
> Sep 6 12:06:00 arrakis kernel: [<c019666a>] proc_pid_readdir+0x13a/0x150
> Sep 6 12:06:00 arrakis kernel: [<c01745f0>] vfs_readdir+0x80/0xa0
> Sep 6 12:06:00 arrakis kernel: [<c0174750>] filldir+0x0/0xd0
> Sep 6 12:06:00 arrakis kernel: [<c017488c>] sys_getdents+0x6c/0xb0
> Sep 6 12:06:00 arrakis kernel: [<c0174750>] filldir+0x0/0xd0
> Sep 6 12:06:00 arrakis kernel: [<c0102fb7>] syscall_call+0x7/0xb
I think there is a bug in next_tgid():
> -static struct task_struct *next_tgid(struct task_struct *start)
> -{
> - struct task_struct *pos;
> + task = NULL;
> rcu_read_lock();
> - pos = start;
> - if (pid_alive(start))
> - pos = next_task(start);
> - if (pid_alive(pos) && (pos != &init_task)) {
> - get_task_struct(pos);
> - goto done;
> +retry:
> + pid = find_ge_pid(tgid);
> + if (pid) {
> + tgid = pid->nr + 1;
> + task = pid_task(pid, PIDTYPE_PID);
> + /* What we to know is if the pid we have find is the
> + * pid of a thread_group_leader. Testing for task
> + * being a thread_group_leader is the obvious thing
> + * todo but there is a window when it fails, due to
> + * the pid transfer logic in de_thread.
> + *
> + * So we perform the straight forward test of seeing
> + * if the pid we have found is the pid of a thread
> + * group leader, and don't worry if the task we have
> + * found doesn't happen to be a thread group leader.
> + * As we don't care in the case of readdir.
> + */
> + if (!task || !has_group_leader_pid(task))
> + goto retry;
> + get_task_struct(task);
> }
> - pos = NULL;
> -done:
> rcu_read_unlock();
> - put_task_struct(start);
> - return pos;
> + return task;
> }
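If the task found is not a group leader, we go to retry, but task is
still != NULL at that point, because it is only cleared once, before the
retry: label.
Now, if the next find_ge_pid(tgid) returns NULL, we skip the if block and
return that stale task, and it was never get_task_struct()'ed. The caller
then drops a reference it does not hold, which would explain the
__put_task_struct() warning above.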
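To make the failure mode concrete, here is a minimal userspace sketch of the retry logic. It is a model rather than kernel code: struct task, the tasks[] table, find_ge() and the refs counter are hypothetical stand-ins for task_struct, the pid hash, find_ge_pid() and get_task_struct(). The "fixed" variant shows one possible repair, clearing task after the retry: label, and is not necessarily the change that should be merged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; not the real kernel definition. */
struct task { int pid; int is_leader; int refs; };

static struct task tasks[] = {
	{ .pid = 1, .is_leader = 1, .refs = 0 },
	{ .pid = 2, .is_leader = 0, .refs = 0 },	/* non-leader, highest pid */
};
#define NTASKS (sizeof(tasks) / sizeof(tasks[0]))

/* Model of find_ge_pid(): first task with pid >= tgid, or NULL. */
static struct task *find_ge(int tgid)
{
	for (size_t i = 0; i < NTASKS; i++)
		if (tasks[i].pid >= tgid)
			return &tasks[i];
	return NULL;
}

/* Mirrors the quoted patch: task is cleared once, before the retry label. */
static struct task *next_tgid_buggy(int tgid)
{
	struct task *task = NULL, *found;
retry:
	found = find_ge(tgid);
	if (found) {
		tgid = found->pid + 1;
		task = found;
		if (!task->is_leader)
			goto retry;	/* task still points at the non-leader */
		task->refs++;		/* models get_task_struct() */
	}
	return task;			/* may be a stale, unreferenced task */
}

/* One possible fix (an assumption): reset task on every pass. */
static struct task *next_tgid_fixed(int tgid)
{
	struct task *task, *found;
retry:
	task = NULL;
	found = find_ge(tgid);
	if (found) {
		tgid = found->pid + 1;
		task = found;
		if (!task->is_leader)
			goto retry;
		task->refs++;
	}
	return task;
}

/* Usage: start the scan at the non-leader, which has the highest pid. */
static void demo(void)
{
	struct task *t = next_tgid_buggy(2);
	assert(t == &tasks[1] && t->refs == 0);	/* returned without a reference */
	assert(next_tgid_fixed(2) == NULL);	/* end of list reported correctly */
}
```

In this model the buggy variant hands back the pid 2 task with no reference taken, while the fixed variant returns NULL once the scan runs off the end.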
Oleg.