lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 4 Dec 2022 01:03:56 +0100
From:   Frederic Weisbecker <frederic@...nel.org>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     "Paul E . McKenney" <paulmck@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Pengfei Xu <pengfei.xu@...el.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Lai Jiangshan <jiangshanlai@...il.com>, rcu@...r.kernel.org
Subject: Re: [PATCH 3/3] rcu-tasks: Fix synchronize_rcu_tasks() VS
 zap_pid_ns_processes()

On Fri, Dec 02, 2022 at 05:28:54PM -0600, Eric W. Biederman wrote:
> Frederic Weisbecker <frederic@...nel.org> writes:
> 
> > On Wed, Nov 30, 2022 at 12:37:15PM -0600, Eric W. Biederman wrote:
> >> Frederic Weisbecker <frederic@...nel.org> writes:
> >> Two questions.
> >> 
> >> 1) Is there any chance you need the exit_task_rcu_stop() and
> >>    exit_tasks_rcu_start() around schedule in the part of this code that
> >>    calls kernel_wait4.
> >
> > Indeed it could be relaxed there too if necessary.
> 
> I was just wondering how you tell if it is necessary and if perhaps
> the kernel_wait4 was overlooked.

As for the deadlock described in this patch, it involves waiting for
another task to reap yet another task. So it doesn't look necessary to
relax the lock when the current task does the reaping.


Now the following looks possible:

      TASK A                                    TASK B                       TASK C
      -----                                   -------                       ------
      mutex_lock(&lock1)
      synchronize_srcu(tasks_rcu_exit_srcu)
                                                                            mutex_lock(&lock1)
                                                                            //wait for TASK A to release &lock1
                                              exit_tasks_rcu_start();
                                              ...
                                              zap_pid_ns_processes()
                                                  kernel_wait4()
                                                      //wait for TASK C

> > So you mean it still reaps those that were EXIT_ZOMBIE before ignoring
> > SIGCHLD (the kernel_wait4() call) but it doesn't sleep anymore on those
> > that autoreap (or get reaped by a parent outside that namespace) after
> > ignoring SIGCHLD? Namely it doesn't do the schedule() loop I'm working
> > around here and proceeds with exit_notify() and notifies its parent?
> 
> Yes.  A change to make it work like when the thread group leader exits
> before the other threads.  So any actual sleeping happens in the reaper
> of the init process when the reaper calls wait(2).
> 
> > And then in this case the responsibility of sleeping, until the init_process
> > of the namespace is the last task in the namespace, goes to the parent while
> > waiting that init_process, right?
> >
> > But what if the init_process of the given namespace autoreaps? Should it then
> > wait itself until the namespace is empty? And then aren't we back to the initial
> > issue?
> 
> The idea is that we only care when the userspace cares.  I don't think
> there is any kernel code that fundamentally cares.  There might be a few
> bits that accidentally care and those would need to be take care of when
> making the proposed change.  The wait would only exist for userspace so
> the same semantics are observed.

I mean the issue is that if the pid_namespace reaper goes for autoreaping, then
it still has to go through that loop waiting for the remaining tasks of the
pid_namespace to be reaped. The loop will just happen later in exit_notify() but
eventually the issue remains: we'll have to relax tasks_rcu_exit_srcu
around that loop.

Thanks.

> 
> Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ