lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 10 Jan 2013 19:39:05 +0100
From:	Sjur Brændeland <sjurbren@...il.com>
To:	Rusty Russell <rusty@...tcorp.com.au>
Cc:	"Michael S. Tsirkin" <mst@...hat.com>,
	Linus Walleij <linus.walleij@...aro.org>,
	virtualization@...ts.linux-foundation.org,
	LKML <linux-kernel@...r.kernel.org>,
	Sjur Brændeland <sjur.brandeland@...ricsson.com>,
	Ohad Ben-Cohen <ohad@...ery.com>
Subject: Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.

Hi Rusty,

On Thu, Jan 10, 2013 at 11:30 AM, Rusty Russell <rusty@...tcorp.com.au> wrote:
...
>I now have some lightly-tested code (via a userspace harness).

Great - thank you for looking into this. I will start integrating this
with my patches
when you send out a proper patch.

...
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 8d5bddb..fd95d3e 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -5,6 +5,12 @@ config VIRTIO
>           bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_LGUEST,
>           CONFIG_RPMSG or CONFIG_S390_GUEST.
>
> +config VHOST
> +       tristate

Inclusion of drivers/virtio from drivers/Makefile depends on VIRTIO.
So I guess VHOST should select VIRTIO to ensure that
drivers/virtio/virtio_host.c
is part of the build.

> +       ---help---
> +         This option is selected by any driver which needs to access
> +         the host side of a virtio ring.
> +
...
> +/* Returns vring->num if empty, -ve on error. */
> +static inline int __vringh_get_head(const struct vringh *vrh,
> +                                   int (*getu16)(u16 *val, const u16 *p),
> +                                   u16 *last_avail_idx)
> +{
> +       u16 avail_idx, i, head;
> +       int err;
> +
> +       err = getu16(&avail_idx, &vrh->vring.avail->idx);
> +       if (err) {
> +               vringh_bad("Failed to access avail idx at %p",
> +                          &vrh->vring.avail->idx);
> +               return err;
> +       }
> +
> +       err = getu16(last_avail_idx, &vring_avail_event(&vrh->vring));
> +       if (err) {
> +               vringh_bad("Failed to access last avail idx at %p",
> +                          &vring_avail_event(&vrh->vring));
> +               return err;
> +       }
> +
> +       if (*last_avail_idx == avail_idx)
> +               return vrh->vring.num;
> +
> +       /* Only get avail ring entries after they have been exposed by guest. */
> +       smp_rmb();

We are accessing memory shared with a remote device (modem), so we probably
need mandatory barriers here, e.g. something like the virtio_rmb
defined in virtio_ring.c.

> +
> +       i = *last_avail_idx & (vrh->vring.num - 1);
> +
> +       err = getu16(&head, &vrh->vring.avail->ring[i]);
> +       if (err) {
> +               vringh_bad("Failed to read head: idx %d address %p",
> +                          *last_avail_idx, &vrh->vring.avail->ring[i]);
> +               return err;
> +       }
> +
> +       if (head >= vrh->vring.num) {
> +               vringh_bad("Guest says index %u > %u is available",
> +                          head, vrh->vring.num);
> +               return -EINVAL;
> +       }
> +       return head;
> +}
...
> +static inline int
> +__vringh_sg(struct vringh_acc *acc,
> +           struct vringh_sg *vsg,
> +           unsigned max,
> +           u16 write_flag,
> +           gfp_t gfp,
> +           int (*getdesc)(struct vring_desc *dst, const struct vring_desc *s))
> +{
> +       unsigned count = 0, num_descs = 0;
> +       struct vringh_sg *orig_vsg = vsg;
> +       int err;
> +
> +       /* This sends end marker on sg[max-1], so we know when to chain. */
> +       if (max)
> +               sg_init_table(&vsg->sg, max);
> +
> +       for (;;) {
> +               /* Exhausted this descriptor?  Read next. */
> +               if (acc->desc.len == 0) {
> +                       if (!(acc->desc.flags & VRING_DESC_F_NEXT))
> +                               break;
> +
> +                       if (num_descs++ == acc->max) {
> +                               err = -ELOOP;
> +                               goto fail;
> +                       }
> +
> +                       if (acc->desc.next >= acc->max) {
> +                               vringh_bad("Chained index %u > %u",
> +                                          acc->desc.next, acc->max);
> +                               err = -EINVAL;
> +                               goto fail;
> +                       }
> +
> +                       acc->idx = acc->desc.next;
> +                       err = getdesc(&acc->desc, acc->start + acc->idx);
> +                       if (unlikely(err))
> +                               goto fail;
> +               }
> +
> +               if (unlikely(!max)) {
> +                       vringh_bad("Unexpected %s descriptor",
> +                                  write_flag ? "writable" : "readable");
> +                       return -EINVAL;
> +               }
> +
> +               /* No more readable/writable descriptors? */
> +               if ((acc->desc.flags & VRING_DESC_F_WRITE) != write_flag) {
> +                       /* We should not have readable after writable */
> +                       if (write_flag) {
> +                               vringh_bad("Readable desc %p after writable",
> +                                          acc->start + acc->idx);
> +                               err = -EINVAL;
> +                               goto fail;
> +                       }
> +                       break;
> +               }
> +
> +               /* Append the pages into the sg. */
> +               err = add_to_sg(&vsg, (void *)(long)acc->desc.addr,
> +                               acc->desc.len, gfp);

I would prefer not to split into pages at this point, but rather provide an
iterator or the original list found in the descriptor to the client.

In our case we use virtio rings to talk to a LTE-modem over shared memory.
The IP traffic is received over the air, interleaved and arrives in
the virtio driver in
large bursts. So virtio driver on the modem receives multiple datagrams
held in large contiguous buffers. Our current approach is to handle each
buffer as a chained descriptor list, where each datagram is kept in
separate chained descriptors. When the buffers are consumed on the linux
host, the modem will read the chained descriptors from the used-ring and
free the entire contiguous buffer in one operation.

So I would prefer if we could avoid this approach of splitting buffers
received in
the ring into multiple sg-list entries as this would break the current
CAIF virtio
implementation.

Regards,
Sjur
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ