netdev - Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220304075039.rewrf3gnbbh3sdfl@sgarzare-redhat>
Date:   Fri, 4 Mar 2022 08:50:39 +0100
From:   Stefano Garzarella <sgarzare@...hat.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     Leon Romanovsky <leon@...nel.org>,
        Lee Jones <lee.jones@...aro.org>, jasowang@...hat.com,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held
 whilst clean-up in progress

On Thu, Mar 03, 2022 at 04:01:06PM -0500, Michael S. Tsirkin wrote:
>On Thu, Mar 03, 2022 at 09:14:36PM +0200, Leon Romanovsky wrote:
>> On Thu, Mar 03, 2022 at 03:19:29PM +0000, Lee Jones wrote:
>> > All workers/users should be halted before any clean-up should take place.
>> >
>> > Suggested-by:  Michael S. Tsirkin <mst@...hat.com>
>> > Signed-off-by: Lee Jones <lee.jones@...aro.org>
>> > ---
>> >  drivers/vhost/vhost.c | 3 +++
>> >  1 file changed, 3 insertions(+)
>> >
>> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> > index bbaff6a5e21b8..d935d2506963f 100644
>> > --- a/drivers/vhost/vhost.c
>> > +++ b/drivers/vhost/vhost.c
>> > @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>> >  	int i;
>> >
>> >  	for (i = 0; i < dev->nvqs; ++i) {
>> > +		/* Ideally all workers should be stopped prior to clean-up */
>> > +		WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
>> > +
>> >  		mutex_lock(&dev->vqs[i]->mutex);
>>
>> I know nothing about vhost, but this construction and patch looks
>> strange to me.
>>
>> If all workers were stopped, you won't need mutex_lock(). The mutex_lock
>> here suggests to me that workers can still run here.
>>
>> Thanks
>
>
>"Ideally" here is misleading, we need a bigger detailed comment
>along the lines of:
>
>/*
> * By design, no workers can run here. But if there's a bug and the
> * driver did not flush all work properly then they might, and we
> * encountered such bugs in the past.  With no proper flush guest won't
> * work correctly but avoiding host memory corruption in this case
> * sounds like a good idea.
> */

Can we use vhost_vq_get_backend() to check this situation?

IIUC all the vhost devices clear the backend to stop the workers.
This is not racy (if we do after the mutex_lock) and should cover all 
cases.

Thanks,
Stefano