linux-kernel - Re: [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <4rqrebfglyif4d7i4ufdnj2uqnubvljkeciqmelvotti5iu5ja@fryxznjicgn6>
Date:   Mon, 5 Jun 2023 10:26:54 +0200
From:   Stefano Garzarella <sgarzare@...hat.com>
To:     Mike Christie <michael.christie@...cle.com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        syzbot <syzbot+d0d442c22fa8db45ff0e@...kaller.appspotmail.com>,
        jasowang@...hat.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com,
        virtualization@...ts.linux-foundation.org, stefanha@...hat.com
Subject: Re: [syzbot] [kvm?] [net?] [virt?] general protection fault in
 vhost_work_queue

On Thu, Jun 01, 2023 at 11:33:09AM -0500, Mike Christie wrote:
>On 6/1/23 2:47 AM, Stefano Garzarella wrote:
>>>
>>> static void vhost_worker_free(struct vhost_dev *dev)
>>> {
>>> -    struct vhost_worker *worker = dev->worker;
>>> +    struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk);
>>>
>>> -    if (!worker)
>>> +    if (!vtsk)
>>>         return;
>>>
>>> -    dev->worker = NULL;
>>> -    WARN_ON(!llist_empty(&worker->work_list));
>>> -    vhost_task_stop(worker->vtsk);
>>> -    kfree(worker);
>>> +    vhost_task_stop(vtsk);
>>> +    WARN_ON(!llist_empty(&dev->worker.work_list));
>>> +    WRITE_ONCE(dev->worker.vtsk, NULL);
>>
>> The patch LGTM, I just wonder if we should set dev->worker to zero here,
>
>We might want to just set kcov_handle to zero for now.
>
>In 6.3 and older, I think we could do:
>
>1. vhost_dev_set_owner could successfully set dev->worker.
>2. vhost_transport_send_pkt runs vhost_work_queue and sees worker
>is set and adds the vhost_work to the work_list.
>3. vhost_dev_set_owner fails in vhost_attach_cgroups, so we stop
>the worker before the work can be run and set worker to NULL.
>4. We clear kcov_handle and return.
>
>We leave the work on the work_list.
>
>5. Userspace can then retry vhost_dev_set_owner. If that works, then the
>work gets executed ok eventually.
>
>OR
>
>Userspace can just close the device. vhost_vsock_dev_release would
>eventually call vhost_dev_cleanup (vhost_dev_flush won't see a worker
>so will just return), and that will hit the WARN_ON but we would
>proceed ok.
>
>If I do a memset of the worker, then if userspace were to retry
>VHOST_SET_OWNER, we would lose the queued work since the work_list would
>get zero'd. I think it's unlikely this ever happens, but you know best
>so let me know if this a real issue.
>

I don't think it's a problem, though, you're right, we could hide the 
warning and thus future bugs, better as you proposed.

Thanks,
Stefano