lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d7530c13-f1a1-311e-7d5e-8e65f3bc2e50@redhat.com>
Date:   Thu, 20 Apr 2023 16:16:08 +0200
From:   Maxime Coquelin <maxime.coquelin@...hat.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     xieyongji@...edance.com, mst@...hat.com, david.marchand@...hat.com,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org,
        xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
        Peter Xu <peterx@...hat.com>
Subject: Re: [RFC 0/2] vduse: add support for networking devices



On 4/20/23 06:34, Jason Wang wrote:
> On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
> <maxime.coquelin@...hat.com> wrote:
>>
>> This small series enables virtio-net device type in VDUSE.
>> With it, basic operation have been tested, both with
>> virtio-vdpa and vhost-vdpa using DPDK Vhost library series
>> adding VDUSE support [0] using split rings layout.
>>
>> Control queue support (and so multiqueue) has also been
>> tested, but require a Kernel series from Jason Wang
>> relaxing control queue polling [1] to function reliably.
>>
>> Other than that, we have identified a few gaps:
>>
>> 1. Reconnection:
>>   a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
>>      index, even after the virtqueue has already been
>>      processed. Is that expected? I have tried instead to
>>      get the driver's avail index directly from the avail
>>      ring, but it does not seem reliable as I sometimes get
>>      "id %u is not a head!\n" warnings. Also such solution
>>      would not be possible with packed ring, as we need to
>>      know the wrap counters values.
> 
> Looking at the codes, it only returns the value that is set via
> set_vq_state(). I think it is expected to be called before the
> datapath runs.
> 
> So when bound to virtio-vdpa, it is expected to return 0. But we need
> to fix the packed virtqueue case, I wonder if we need to call
> set_vq_state() explicitly in virtio-vdpa before starting the device.
> 
> When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE which
> will end up a call to set_vq_state(). Unfortunately, it doesn't
> support packed ring which needs some extension.
> 
>>
>>   b. Missing IOCTLs: it would be handy to have new IOCTLs to
>>      query Virtio device status,
> 
> What's the use case of this ioctl? It looks to me userspace is
> notified on each status change now:
> 
> static int vduse_dev_set_status(struct vduse_dev *dev, u8 status)
> {
>          struct vduse_dev_msg msg = { 0 };
> 
>          msg.req.type = VDUSE_SET_STATUS;
>          msg.req.s.status = status;
> 
>          return vduse_dev_msg_sync(dev, &msg);
> }

The idea was to be able to query the status at reconnect time, and
neither having to assume its value nor having to store its value in a
file (the status could change while the VDUSE application is stopped,
but maybe it would receive the notification at reconnect).

I will prototype using a tmpfs file to save needed information, and see
if it works.

>> and retrieve the config
>>      space set at VDUSE_CREATE_DEV time.
> 
> In order to be safe, VDUSE avoids writable config space. Otherwise
> drivers could block on config writing forever. That's why we don't do
> it now.

The idea was not to make the config space writable, but just to be able
to fetch what was filled at VDUSE_CREATE_DEV time.

With the tmpfs file, we can avoid doing that and just save the config
space there.

> We need to harden the config write before we can proceed to this I think.
> 
>>
>> 2. VDUSE application as non-root:
>>    We need to run the VDUSE application as non-root. There
>>    is some race between the time the UDEV rule is applied
>>    and the time the device starts being used. Discussing
>>    with Jason, he suggested we may have a VDUSE daemon run
>>    as root that would create the VDUSE device, manages its
>>    rights and then pass its file descriptor to the VDUSE
>>    app. However, with current IOCTLs, it means the VDUSE
>>    daemon would need to know several information that
>>    belongs to the VDUSE app implementing the device such
>>    as supported Virtio features, config space, etc...
>>    If we go that route, maybe we should have a control
>>    IOCTL to create the device which would just pass the
>>    device type. Then another device IOCTL to perform the
>>    initialization. Would that make sense?
> 
> I think so. We can hear from others.
> 
>>
>> 3. Coredump:
>>    In order to be able to perform post-mortem analysis, DPDK
>>    Vhost library marks pages used for vrings and descriptors
>>    buffers as MADV_DODUMP using madvise(). However with
>>    VDUSE it fails with -EINVAL. My understanding is that we
>>    set VM_DONTEXPAND flag to the VMAs and madvise's
>>    MADV_DODUMP fails if it is present. I'm not sure to
>>    understand why madvise would prevent MADV_DODUMP if
>>    VM_DONTEXPAND is set. Any thoughts?
> 
> Adding Peter who may know the answer.

Thanks!
Maxime

> Thanks
> 
>>
>> [0]: https://patchwork.dpdk.org/project/dpdk/list/?series=27594&state=%2A&archive=both
>> [1]: https://lore.kernel.org/lkml/CACGkMEtgrxN3PPwsDo4oOsnsSLJfEmBEZ0WvjGRr3whU+QasUg@mail.gmail.com/T/
>>
>> Maxime Coquelin (2):
>>    vduse: validate block features only with block devices
>>    vduse: enable Virtio-net device type
>>
>>   drivers/vdpa/vdpa_user/vduse_dev.c | 11 +++++++----
>>   1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> --
>> 2.39.2
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ