lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CACGkMEsX9=MAT=pg+MG-5_0PN7DYGNv8G9E2ZLgtZ1TmCLV-0g@mail.gmail.com>
Date: Tue, 29 Jul 2025 10:33:48 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: xuanzhuo@...ux.alibaba.com, eperezma@...hat.com, 
	virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V5] virtio_ring: add in order support

On Mon, Jul 28, 2025 at 6:17 PM Michael S. Tsirkin <mst@...hat.com> wrote:
>
> On Mon, Jul 28, 2025 at 02:41:29PM +0800, Jason Wang wrote:
> > This patch implements in order support for both split virtqueue and
> > packed virtqueue. Perfomance could be gained for the device where the
> > memory access could be expensive (e.g vhost-net or a real PCI device):
> >
> > Benchmark with KVM guest:
> >
> > Vhost-net on the host: (pktgen + XDP_DROP):
> >
> >          in_order=off | in_order=on | +%
> >     TX:  5.20Mpps     | 6.20Mpps    | +19%
> >     RX:  3.47Mpps     | 3.61Mpps    | + 4%
> >
> > Vhost-user(testpmd) on the host: (pktgen/XDP_DROP):
> >
> > For split virtqueue:
> >
> >          in_order=off | in_order=on | +%
> >     TX:  5.60Mpps     | 5.60Mpps    | +0.0%
> >     RX:  9.16Mpps     | 9.61Mpps    | +4.9%
> >
> > For packed virtqueue:
> >
> >          in_order=off | in_order=on | +%
> >     TX:  5.60Mpps     | 5.70Mpps    | +1.7%
> >     RX:  10.6Mpps     | 10.8Mpps    | +1.8%
> >
> > Benchmark also shows no performance impact for in_order=off for queue
> > size with 256 and 1024.
> >
> > Signed-off-by: Jason Wang <jasowang@...hat.com>
> > Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> > ---
> > Changes since V4:
> > - Fix build error when DEBUG is enabled
> > - Fix function duplications
> > - Remove unnecessary new lines
> > ---
> > drivers/virtio/virtio_ring.c | 421 +++++++++++++++++++++++++++++++++--
> >  1 file changed, 401 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index e675d8305dbf..c6558e271f97 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -70,11 +70,14 @@
> >  enum vq_layout {
> >       SPLIT = 0,
> >       PACKED,
> > +     SPLIT_IN_ORDER,
> > +     PACKED_IN_ORDER,
> >       VQ_TYPE_MAX,
> >  };
>
>
> how about specifying the #s here?
>         SPLIT = 0,
>         PACKED = 1,
>         IN_ORDER = 2,
>         SPLIT_IN_ORDER = 2,
>         PACKED_IN_ORDER = 3,
>         VQ_TYPE_MAX,
>
> and then
>
>   static inline bool virtqueue_is_packed(const struct vring_virtqueue *vq)
>   {
>         return vq->layout & PACKED;
>  }
>
>  static inline bool virtqueue_is_in_order(const struct vring_virtqueue *vq)
>  {
>         return vq->layout & IN_ORDER;
>   }
>
> which is a tiny bit less code.
>
> worth doing?

Probably not, for example it would introduce branches. As we
discussed, once we have sufficient optimizations, most of the branches
could be saved.

>
> >
> >  struct vring_desc_state_split {
> >       void *data;                     /* Data for callback. */
> > +     u32 total_len;                  /* Buffer Length */
> >
> >       /* Indirect desc table and extra table, if any. These two will be
> >        * allocated together. So we won't stress more to the memory allocator.
>
>
> this is only used for in_order, and it increases the struct size
> by half due to padding. why not a separate struct?
> Or if you like, reuse vring_desc_state_packed - it is same
> size with this addition.
>
>
> > @@ -84,6 +87,7 @@ struct vring_desc_state_split {
> >
> >  struct vring_desc_state_packed {
> >       void *data;                     /* Data for callback. */
> > +     u32 total_len;                  /* Buffer Length */
> >
> >       /* Indirect desc table and extra table, if any. These two will be
> >        * allocated together. So we won't stress more to the memory allocator.
>
> there's empty space at the end of this struct.
> struct vring_desc_state_packed {
>         void *data;                     /* Data for callback. */
>         u32 total_len;                  /* Buffer Length */
>
>         /* Indirect desc table and extra table, if any. These two will be
>          * allocated together. So we won't stress more to the memory allocator.
>          */
>         struct vring_packed_desc *indir_desc;
>         u16 num;                        /* Descriptor list length. */
>         u16 last;                       /* The last desc state in a list. */
> };
>
> why not put it there?
>

Fine.

Thanks


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ