lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZWhw_Q6T26e0pedm@alley>
Date:   Thu, 30 Nov 2023 12:24:45 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Kent Overstreet <kent.overstreet@...ux.dev>
Cc:     linux-kernel@...r.kernel.org, Nicholas Piggin <npiggin@...il.com>,
        Mike Christie <michael.christie@...cle.com>,
        Zqiang <qiang1.zhang@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] kthread: kthread_should_stop() checks if we're a kthread

On Tue 2023-11-28 12:40:12, Kent Overstreet wrote:
> On Tue, Nov 28, 2023 at 10:30:49AM +0100, Petr Mladek wrote:
> > Adding Andrew into Cc. He usually takes changes in kernel/kthread.c.
> > 
> > On Mon 2023-11-20 17:15:03, Kent Overstreet wrote:
> > > bcachefs has a fair amount of code that may or may not be running from a
> > > kthread (it may instead be called by a userspace ioctl); having
> > > kthread_should_stop() check if we're a kthread enables a fair bit of
> > > cleanup and makes it safer to use.
> > > 
> > > Signed-off-by: Kent Overstreet <kent.overstreet@...ux.dev>
> > > ---
> > >  kernel/kthread.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > > index 1eea53050bab..fe6090ddf414 100644
> > > --- a/kernel/kthread.c
> > > +++ b/kernel/kthread.c
> > > @@ -155,7 +155,8 @@ void free_kthread_struct(struct task_struct *k)
> > >   */
> > >  bool kthread_should_stop(void)
> > >  {
> > > -	return test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags);
> > > +	return (current->flags & PF_KTHREAD) &&
> > > +		test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags);
> > >  }
> > >  EXPORT_SYMBOL(kthread_should_stop);
> > 
> > I agree that it makes the API more safe because &to_kthread(current)
> > is NULL when the process is not a kthread.
> > 
> > Well, I do not like the idea of quietly ignoring a misuse of
> > the kthread_*() API. I would personally prefer to do:
> 
> It's only a misuse in the most pedantic sense, IMO.
> 
> Is it ever possible that with this change calling kthread_should_stop()
> from a non-kthread could cause a problem?

Of course, calling the updated kthread_should_stop() won't
cause problems in non-kthread context.

But it would make it more complicated to check for misuse
in the future.

Q: Why is a check for misuse important?
A: It partly depends on the personal opinion. But in general, it
   prevents other problems.

Q: Is it worth it in this case?
A: I do not have strong opinion.


I have basically two concerns:

1. Consistent behavior:

   For example, kthread_use_mm() and kthread_unuse_mm() already
   warn about a misuse.


2. Understanding the code

   Unconditional use of kthread_should_stop() makes the feeling
   that the code is intended for kthread context. People, not
   familiar with the code might miss that it might run also
   in userspace process.

   And indeed, I have looked how kthread_should_stop() is
   used in bcachefs code. For example, bch2_move_ratelimit()
   seems to handle only the kthread context:

int bch2_move_ratelimit(struct moving_context *ctxt)
{
[...]
	if (ctxt->wait_on_copygc && !c->copygc_running) {
		bch2_trans_unlock_long(ctxt->trans);
		wait_event_killable(c->copygc_running_wq,
				    !c->copygc_running ||
				    kthread_should_stop());
	}


    Yes, wait_event_killable() is primary for userspace
    process. AFAIK, signals are not delivered to kthreads
    by default, see sig_task_ignored().

    But the code does not check return value so that
    it does not detect when usespace process calling
    this function was killed.


	do {
		delay = ctxt->rate ? bch2_ratelimit_delay(ctxt->rate) : 0;
[...]
		if (delay) {
[...]
			set_current_state(TASK_INTERRUPTIBLE);
		}
[...]
		if ((current->flags & PF_KTHREAD) && kthread_should_stop()) {
			__set_current_state(TASK_RUNNING);
			return 1;
		}

		if (delay)
			schedule_timeout(delay);
[...]
		} while (delay);

   Here, the do-while cycle returns early when the kthread gets
   stopped. But it does not check whether there is a pending
   signal when called from userspace process.

   IMHO, it would be better to make it clear that the function might
   be called from both kthread and userspace processes. Otherwise,
   someone could later replace wait_event_killable() to wait_event()
   because kthreads do not react to signals by default.

   Also it would be worth to make it more clear what code is for
   the kthread context and what is for the userspace context.
   And the following would make it more obvious:

	if (in_kthread) {
		// handle kthread_should_stop
	} else {
		// handle signal_pending()
	}

   IMPORTANT: If the function could return early because of signal
	then it might make sense to make the kthread call path
	more robust. The kthread should not return before another
	process called kthread_stop().

	I vaguely remember that it might cause some problems.
	I am not sure how big problems. Either the kthread would stay in
	a limbo state because nobody would read the exit
	code. Or the eventual kthread_stop() caller could
	blow up or something like this.


BTW: kthread_should_stop() is called in wait_event_killable()
     without PF_KTHREAD check so that it could probably blow up.


My experience says that it is far from trivial to handle kthread_stop(),
freezer, and signals correctly. IMHO, doing some implicit
NOPs do not help to get the right picture.

BTW: try_to_freeze() might cause a deadlock when a kthread_stop()
     caller waits for the kthread() return but the kthread()
     is blocked in __refrigerator().

     At least this is my understanding of the commit 8a32c441c1609f80e
     ("freezer: implement and use kthread_freezable_should_stop()").

     Maybe, __refrigerator() should check kthread_should_stop()
     for all kthreads. But maybe, it might break something else.


> > // define this in include/linux/kthread.h
> > static inline bool in_kthread(void)
> > {
> > 	return current->flags & PF_KTHREAD
> > }
> > 
> > // add WARN() into kthread_should_stop()
> > bool kthread_should_stop(void)
> > {
> > 	if (WARN_ON_ONCE(!in_kthread))
> > 		return false;
> > 
> > 	return test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags);
> > }
> > EXPORT_SYMBOL(kthread_should_stop);

I like this solution because the warning is printed even when
KTHREAD_SHOULD_STOP is not set. It means that misuse might
be caught easily.

Anyway, I am going to let Andrew to do a final decision.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ