linux-kernel - Re: [RFC PATCH 5/8] vhost: Add callback that stops new work and waits on running ones

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87ttw9fpsq.fsf@email.froward.int.ebiederm.org>
Date:   Thu, 18 May 2023 13:38:29 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Mike Christie <michael.christie@...cle.com>
Cc:     Christian Brauner <brauner@...nel.org>, oleg@...hat.com,
        linux@...mhuis.info, nicolas.dichtel@...nd.com, axboe@...nel.dk,
        torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, mst@...hat.com,
        sgarzare@...hat.com, jasowang@...hat.com, stefanha@...hat.com
Subject: Re: [RFC PATCH 5/8] vhost: Add callback that stops new work and
 waits on running ones

Mike Christie <michael.christie@...cle.com> writes:

> On 5/18/23 9:18 AM, Christian Brauner wrote:
>>> @@ -352,12 +353,13 @@ static int vhost_worker(void *data)
>>>  		if (!node) {
>>>  			schedule();
>>>  			/*
>>> -			 * When we get a SIGKILL our release function will
>>> -			 * be called. That will stop new IOs from being queued
>>> -			 * and check for outstanding cmd responses. It will then
>>> -			 * call vhost_task_stop to exit us.
>>> +			 * When we get a SIGKILL we kick off a work to
>>> +			 * run the driver's helper to stop new work and
>>> +			 * handle completions. When they are done they will
>>> +			 * call vhost_task_stop to tell us to exit.
>>>  			 */
>>> -			vhost_task_get_signal();
>>> +			if (vhost_task_get_signal())
>>> +				schedule_work(&dev->destroy_worker);
>>>  		}
>> 
>> I'm pretty sure you still need to actually call exit here. Basically
>> mirror what's done in io_worker_exit() minus the io specific bits.
>
> We do call do_exit(). Once destory_worker has flushed the device and
> all outstanding IO has completed it call vhost_task_stop(). vhost_worker()
> above then breaks out of the loop and returns and vhost_task_fn() does
> do_exit().

I am not certain how you want to structure this but you really should
not call get_signal after it returns positive before you call do_exit.

You are in complete uncharted and untested waters calling get_signal
multiple times, when get_signal figures the proper response is to
call do_exit itself.

Eric