linux-kernel - Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230726-tierisch-soweit-7bcefe6957dc@brauner>
Date:   Wed, 26 Jul 2023 11:25:24 +0200
From:   Christian Brauner <brauner@...nel.org>
To:     Xuewen Yan <xuewen.yan94@...il.com>
Cc:     Xuewen Yan <xuewen.yan@...soc.com>, jack@...e.cz,
        keescook@...omium.org, peterz@...radead.org,
        vincent.guittot@...aro.org, linux-kernel@...r.kernel.org,
        di.shen@...soc.com
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the
 find_pid_ns

On Wed, Jul 26, 2023 at 09:23:13AM +0800, Xuewen Yan wrote:
> On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <brauner@...nel.org> wrote:
> >
> > On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <brauner@...nel.org> wrote:
> > > >
> > > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > > But there is a corner case when ns is null, for example: if user
> > > > > call find_get_pid when current is in exiting, the following stack would
> > > > > set thread_id be null:
> > > > > release_task
> > > > >     __exit_signal(p);
> > > > >         __unhash_process(tsk, group_dead);
> > > > >               detach_pid(p, PIDTYPE_PID);
> > > > >                   __change_pid(task, type, NULL);
> > > > >
> > > > > If user call find_get_pid at now, in find_vpid function, the
> > > >
> > > > I fail to see how this can happen. The code you're referencing is in
> > > > release_task(). If current has gone through that then current obviously
> > > > can't call find_vpid() on itself anymore or anything else for that
> > > > matter.
> > >
> > > This happened when user calls  find_vpid() in irq.
> > >
> > > [72117.635162] Call trace:
> > > [72117.635595]  idr_find+0xc/0x24
> > > [72117.636103]  find_get_pid+0x40/0x68
> > > [72117.636812]  send_event+0x88/0x180 [demux]
> > > [72117.637593]  vbvop_copy_data+0x150/0x344 [demux]
> > > [72117.638434]  dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > > [72117.639393]  dmisr_video_parsing_switch+0x68/0xec [demux]
> > > [72117.640332]  dmisr_handle_video_pes+0x10c/0x26c [demux]
> > > [72117.641108]  tasklet_action_common+0x130/0x224
> > > [72117.641784]  tasklet_action+0x28/0x34
> > > [72117.642366]  __do_softirq+0x128/0x4dc
> > > [72117.642944]  irq_exit+0xf8/0xfc
> > > [72117.643459]  __handle_domain_irq+0xb0/0x108
> > > [72117.644102]  gic_handle_irq+0x6c/0x124
> > > [72117.644691]  el1_irq+0x108/0x200
> > > [72117.645217]  _raw_write_unlock_irq+0x2c/0x5c
> > > [72117.645870]  release_task+0x144/0x1ac   <<<<<<
> > > [72117.646447]  do_exit+0x524/0x94c
> > > [72117.646970]  __do_sys_exit_group+0x0/0x14
> > > [72117.647591]  do_group_exit+0x0/0xa0
> > > [72117.648146]  __se_sys_exit+0x0/0x20
> > > [72117.648704]  el0_svc_common+0xcc/0x1bc
> > > [72117.649292]  el0_svc_handler+0x2c/0x3c
> > > [72117.649881]  el0_svc+0x8/0xc
> > >
> > > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > > this time, if user calls find_get_pid() in irq, because
> > > current->thread_id is NULL,
> > > it will handle the NULL pointer.
> >
> > Uhm, where is that code from? This doesn't seem to be upstream.
> 
> It's from our own platform, we found someone called  find_get_pid() in
> the module, and caused the crash.

So this is a bug report for an out of tree driver which I'm sure you're
aware we consider mostly irrelevant unless this is an upstream issue.

Please work around or better fix this in your out of tree driver or
please show a reproducer how this can happen on upstream kernels.

Otherwise I don't see why we'd care.