lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230605123844.GA32275@redhat.com>
Date:   Mon, 5 Jun 2023 14:38:45 +0200
From:   Oleg Nesterov <oleg@...hat.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Mike Christie <michael.christie@...cle.com>, linux@...mhuis.info,
        nicolas.dichtel@...nd.com, axboe@...nel.dk,
        torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, mst@...hat.com,
        sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com,
        brauner@...nel.org
Subject: Re: [CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps
 regression

On 06/02, Eric W. Biederman wrote:
>
>  static int vhost_task_fn(void *data)
>  {
>  	struct vhost_task *vtsk = data;
> -	int ret;
> +	bool dead = false;
> +
> +	for (;;) {
> +		bool did_work;
> +
> +		if (!dead && signal_pending(current)) {
> +			struct ksignal ksig;
> +			/*
> +			 * Calling get_signal can block in SIGSTOP,
> +			 * and the freezer.  Or it can clear
> +			 * fatal_signal_pending and return non-zero.
> +			 */
> +			dead = get_signal(&ksig);
> +			if (dead)
> +				set_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags);
> +		}
> +
> +		/* mb paired w/ kthread_stop */
> +		set_current_state(TASK_INTERRUPTIBLE);
> +
> +		did_work = vtsk->fn(vtsk->data);

I don't understand why do you set TASK_INTERRUPTIBLE before vtsk->fn(),
it seems that you could do this before the test_bit(FLAGS_STOP) below.
But probably I missed something and this is minor anyway...

> +		if (!did_work) {
> +			if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
> +				__set_current_state(TASK_RUNNING);
> +				break;

What if VHOST_TASK_FLAGS_STOP was set by us after get_signal() above ?
We need to ensure that in this case vhost_work_queue() can't add a new work,
nobody will flush it.

In fact, unless I missed something this can even race with vhost_dev_flush().

	vhost_dev_flush:				vhost_task_fn:

	checks FLAGS_STOP, not set,
	vhost_task_flush() returns false
							gets SIGKILL, sets FLAGS_STOP

							vtsk->fn() returns false

							vhost_task_fn() exits.

	vhost_work_queue();
	wait_for_completion(&flush.wait_event);


and the last wait_for_completion() will hang forever.

Oleg.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ