Message-ID: <CAHk-=whXt9+-YfhgjBYxT9_ATjHbMDZ0yJdK7umrJGU8zBVZ9w@mail.gmail.com>
Date:   Mon, 12 Jun 2023 10:51:24 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     "Darrick J. Wong" <djwong@...nel.org>,
        Dave Chinner <david@...morbit.com>,
        Zorro Lang <zlang@...hat.com>, linux-xfs@...r.kernel.org,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Mike Christie <michael.christie@...cle.com>,
        "Michael S. Tsirkin" <mst@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report]
 fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)

On Mon, Jun 12, 2023 at 10:29 AM Jens Axboe <axboe@...nel.dk> wrote:
>
> Looks fine to me to just kill it indeed, whatever we did need this
> for is definitely no longer the case. I _think_ we used to have
> something in the worker exit that would potentially sleep which
> is why we killed it before doing that, now it just looks like dead
> code.

Ok, can you (and the fsstress people) confirm that this
whitespace-damaged patch fixes the coredump issue:


  --- a/io_uring/io-wq.c
  +++ b/io_uring/io-wq.c
  @@ -221,9 +221,6 @@ static void io_worker_exit(..
        raw_spin_unlock(&wq->lock);
        io_wq_dec_running(worker);
        worker->flags = 0;
  -     preempt_disable();
  -     current->flags &= ~PF_IO_WORKER;
  -     preempt_enable();

        kfree_rcu(worker, rcu);
        io_worker_ref_put(wq);

Jens, I think that the two lines above there, ie the whole

        io_wq_dec_running(worker);
        worker->flags = 0;

thing may actually be the (partial?) reason for those PF_IO_WORKER
games. It's basically doing "now I'm doing stats by hand", and I
wonder if it now decrements the running worker count one time too many?

Ie when the finally *dead* worker schedules away, never to return,
that's when that io_wq_worker_sleeping() case triggers and decrements
things one more time.

So there might be some bookkeeping reason for those games, but it
looks like if that's the case, then that

        worker->flags = 0;

will have already taken care of it.
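
To make the double-decrement concern concrete, here is a minimal userspace sketch of the accounting described above. It is not the kernel code: every `model_*` name and the flag values are hypothetical stand-ins, and the "sleeping" hook is assumed to bail out when the worker's UP/RUNNING flags are clear. Under that assumption, clearing the flags by hand is already enough to stop the final schedule-away from decrementing a second time.

```c
#include <assert.h>

/* Hypothetical stand-ins for the worker flags; the values are illustrative. */
enum { MODEL_F_UP = 1, MODEL_F_RUNNING = 2 };

struct model_worker {
	unsigned int flags;
	int *nr_running;	/* shared count of running workers */
};

static void model_dec_running(struct model_worker *w)
{
	(*w->nr_running)--;
}

/*
 * Models the hook that fires when a worker task schedules away.
 * Assumption: it only adjusts the count for a worker that is still
 * marked both UP and RUNNING.
 */
static void model_worker_sleeping(struct model_worker *w)
{
	if (!(w->flags & MODEL_F_UP))
		return;
	if (!(w->flags & MODEL_F_RUNNING))
		return;
	w->flags &= ~MODEL_F_RUNNING;
	model_dec_running(w);
}

/*
 * Models the exit path followed by the dead worker's last schedule.
 * Returns the final running count; the correct value is 0.
 */
static int model_exit_then_sleep(int clear_flags)
{
	int nr_running = 1;
	struct model_worker w = { MODEL_F_UP | MODEL_F_RUNNING, &nr_running };

	model_dec_running(&w);		/* "doing stats by hand" at exit */
	if (clear_flags)
		w.flags = 0;		/* the worker->flags = 0 step */

	model_worker_sleeping(&w);	/* dead worker schedules away */
	return nr_running;
}

/* Sanity check of the two cases discussed in the text. */
static void model_selfcheck(void)
{
	assert(model_exit_then_sleep(1) == 0);	/* flags cleared: one decrement */
	assert(model_exit_then_sleep(0) == -1);	/* flags kept: decremented twice */
}
```

In this model the count only goes wrong when the flags are left set, which matches the reading that the flag reset, rather than the PF_IO_WORKER clearing, is what guards the bookkeeping.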

I wonder if those two lines could just be removed too. But I think
that's separate from the "let's fix the core dumping" issue.

           Linus
