[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250810154255.GA11928@redhat.com>
Date: Sun, 10 Aug 2025 17:42:55 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Christian Brauner <brauner@...nel.org>
Cc: 高翔 <gaoxiang17@...omi.com>,
Al Viro <viro@...iv.linux.org.uk>,
Xiang Gao <gxxa03070307@...il.com>,
"mjguzik@...il.com" <mjguzik@...il.com>,
"Liam.Howlett@...cle.com" <Liam.Howlett@...cle.com>,
"joel.granados@...nel.org" <joel.granados@...nel.org>,
"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] pid: Add a judgment for ns null in pid_nr_ns
On 08/08, Christian Brauner wrote:
>
> On Tue, Aug 05, 2025 at 02:43:01PM +0200, Oleg Nesterov wrote:
> >
> > After the quick grep I don't see the problematic users, but if a zombie
> > task T does task_ppid_nr_ns(current, NULL) the kernel can crash:
> >
> > - pid_alive() succeeds, the task is not reaped yet
> >
> > - the parent/debugger does wait()->release_task(T), T->thread_pid
> > is NULL now
> >
> > - T calls task_tgid_nr_ns()-> ... pid_nr_ns(ns => NULL) because
> > task_active_pid_ns(T) returns NULL
> >
> > Do you think this worth fixing?
>
> If it's not too much work and it is an actual real-world concern then I
> think we should fix it. But I trust your judgement here!
Well, I don't really know what to do ;)
Initially I was going to do something like below to "fix" task_tgid_nr_ns()
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -299,7 +299,9 @@ static inline pid_t task_ppid_nr_ns(const struct task_struct *tsk, struct pid_na
pid_t pid = 0;
rcu_read_lock();
- if (pid_alive(tsk))
+ if (!ns)
+ ns = task_active_pid_ns(current);
+ if (ns && pid_alive(tsk))
pid = task_tgid_nr_ns(rcu_dereference(tsk->real_parent), ns);
rcu_read_unlock();
and then add WARN_ON() into pid_nr_ns(). But note that we should not check 'ns'
before 'pid', we need
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -491,7 +491,13 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
struct upid *upid;
pid_t nr = 0;
- if (pid && ns->level <= pid->level) {
+ if (!pid)
+ return nr;
+
+ if (WARN_ON_ONCE(!ns)
+ return nr;
+
+ if (ns->level <= pid->level) {
upid = &pid->numbers[ns->level];
if (upid->ns == ns)
nr = upid->nr;
Think of task_pid_vnr(current) from a zombie, it should return 0 without warning.
But this looks overcomplicated to me. So I am going to send the patch which simply
changes __task_pid_nr_ns()
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -514,7 +514,8 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
rcu_read_lock();
if (!ns)
ns = task_active_pid_ns(current);
- nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
+ if (ns)
+ nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
rcu_read_unlock();
and leave pid_nr_ns() alone.
People will do mistakes, it is better not to crash and just return 0 if, say,
a zombie task does task_pid_vnr(another_task). Currently it is not simple to
do this correctly because, again, the pid_alive(current) check can't help.
pid_nr_ns() is more "low-level", its users should know what they are doing.
Although the quick grep suggests many of cleanups. Say, why kvm_create_pit()
does get_pid? why can't it simply use task_tgid_vnr(current). Even do_getpgid()
and sys_getsid() look overcomplicated. Later.
Oleg.
Powered by blists - more mailing lists