linux-kernel - Re: [PATCH V6 19/19] virtio

Message-ID: <20250924015115-mutt-send-email-mst@kernel.org>
Date: Wed, 24 Sep 2025 01:53:08 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Jason Wang <jasowang@...hat.com>
Cc: xuanzhuo@...ux.alibaba.com, eperezma@...hat.com,
	virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V6 19/19] virtio_ring: add in order support

On Wed, Sep 24, 2025 at 01:38:03PM +0800, Jason Wang wrote:
> On Mon, Sep 22, 2025 at 2:24 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> >
> > On Fri, Sep 19, 2025 at 03:31:54PM +0800, Jason Wang wrote:
> > > This patch implements in order support for both split virtqueue and
> > > > packed virtqueue. Performance could be gained for the device where the
> > > > memory access could be expensive (e.g. vhost-net or a real PCI device):
> > >
> > > Benchmark with KVM guest:
> > >
> > > Vhost-net on the host: (pktgen + XDP_DROP):
> > >
> > >          in_order=off | in_order=on | +%
> > >     TX:  5.20Mpps     | 6.20Mpps    | +19%
> > >     RX:  3.47Mpps     | 3.61Mpps    | + 4%
> > >
> > > Vhost-user(testpmd) on the host: (pktgen/XDP_DROP):
> > >
> > > For split virtqueue:
> > >
> > >          in_order=off | in_order=on | +%
> > >     TX:  5.60Mpps     | 5.60Mpps    | +0.0%
> > >     RX:  9.16Mpps     | 9.61Mpps    | +4.9%
> > >
> > > For packed virtqueue:
> > >
> > >          in_order=off | in_order=on | +%
> > >     TX:  5.60Mpps     | 5.70Mpps    | +1.7%
> > >     RX:  10.6Mpps     | 10.8Mpps    | +1.8%
> > >
> > > > Benchmark also shows no performance impact for in_order=off with
> > > > queue sizes of 256 and 1024.
> > >
> > > Signed-off-by: Jason Wang <jasowang@...hat.com>
> > > Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> > > ---
> > >  drivers/virtio/virtio_ring.c | 421 +++++++++++++++++++++++++++++++++--
> > >  1 file changed, 401 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index b700aa3e56c3..c00b5e57f2fc 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -70,6 +70,8 @@
> > >  enum vq_layout {
> > >       SPLIT = 0,
> > >       PACKED,
> > > +     SPLIT_IN_ORDER,
> > > +     PACKED_IN_ORDER,
> > >       VQ_TYPE_MAX,
> > >  };
> > >
> > > @@ -80,6 +82,7 @@ struct vring_desc_state_split {
> > >        * allocated together. So we won't stress more to the memory allocator.
> > >        */
> > >       struct vring_desc *indir_desc;
> > > +     u32 total_len;                  /* Buffer Length */
> > >  };
> > >
> > >  struct vring_desc_state_packed {
> > > @@ -91,6 +94,7 @@ struct vring_desc_state_packed {
> > >       struct vring_packed_desc *indir_desc;
> > >       u16 num;                        /* Descriptor list length. */
> > >       u16 last;                       /* The last desc state in a list. */
> > > +     u32 total_len;                  /* Buffer Length */
> > >  };
> > >
> > >  struct vring_desc_extra {
> > > @@ -206,6 +210,17 @@ struct vring_virtqueue {
> > >
> > >       /* Head of free buffer list. */
> > >       unsigned int free_head;
> > > +
> > > +     /*
> > > +      * With IN_ORDER, devices write a single used ring entry with
> > > +      * the id corresponding to the head entry of the descriptor chain
> > > +      * describing the last buffer in the batch
> > > +      */
> > > +     struct used_entry {
> > > +             u32 id;
> > > +             u32 len;
> > > +     } batch_last;
> > > +
> > >       /* Number we've added since last sync. */
> > >       unsigned int num_added;
> > >
> > > @@ -258,7 +273,12 @@ static void vring_free(struct virtqueue *_vq);
> > >
> > >  static inline bool virtqueue_is_packed(const struct vring_virtqueue *vq)
> > >  {
> > > -     return vq->layout == PACKED;
> > > +     return vq->layout == PACKED || vq->layout == PACKED_IN_ORDER;
> > > +}
> > > +
> > > +static inline bool virtqueue_is_in_order(const struct vring_virtqueue *vq)
> > > +{
> > > +     return vq->layout == SPLIT_IN_ORDER || vq->layout == PACKED_IN_ORDER;
> > >  }
> > >
> > >  static bool virtqueue_use_indirect(const struct vring_virtqueue *vq,
> > > @@ -575,6 +595,8 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
> > >       struct scatterlist *sg;
> > >       struct vring_desc *desc;
> > >       unsigned int i, n, avail, descs_used, err_idx, c = 0;
> > > +     /* Total length for in-order */
> > > +     unsigned int total_len = 0;
> > >       int head;
> > >       bool indirect;
> > >
> > > @@ -646,6 +668,7 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
> > >                                                    ++c == total_sg ?
> > >                                                    0 : VRING_DESC_F_NEXT,
> > >                                                    premapped);
> > > +                     total_len += len;
> > >               }
> > >       }
> > >       for (; n < (out_sgs + in_sgs); n++) {
> > > @@ -663,6 +686,7 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
> > >                               i, addr, len,
> > >                               (++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
> > >                               VRING_DESC_F_WRITE, premapped);
> > > +                     total_len += len;
> > >               }
> > >       }
> > >
> > > @@ -685,7 +709,12 @@ static inline int virtqueue_add_split(struct vring_virtqueue *vq,
> > >       vq->vq.num_free -= descs_used;
> > >
> > >       /* Update free pointer */
> > > -     if (indirect)
> > > +     if (virtqueue_is_in_order(vq)) {
> > > +             vq->free_head += descs_used;
> > > +             if (vq->free_head >= vq->split.vring.num)
> > > +                     vq->free_head -= vq->split.vring.num;
> > > > +             vq->split.desc_state[head].total_len = total_len;
> > > +     } else if (indirect)
> > >               vq->free_head = vq->split.desc_extra[head].next;
> > >       else
> > >               vq->free_head = i;
> > > @@ -858,6 +887,14 @@ static bool more_used_split(const struct vring_virtqueue *vq)
> > >       return virtqueue_poll_split(vq, vq->last_used_idx);
> > >  }
> > >
> > > +static bool more_used_split_in_order(const struct vring_virtqueue *vq)
> > > +{
> > > +     if (vq->batch_last.id != vq->packed.vring.num)
> > > +             return true;
> >
> > Hmm why ->packed?
> 
> Right, it's a bug. Let me fix that.
> 
> >
> > This is actually a problem in this approach, kinda easy to get confused
> > which variant to call where.
> 
> Probably, but we have been doing this since the introduction of packed
> virtqueue.
> 
> >
> > Worth thinking how to fix this.
> >
> 
> Yes, but I think this series improves this by introducing the
> virtqueue ops. Optimization could be done on top.
> 
> For example, having separate files for packed and split with private structure.
> 
> Thanks

Sure.

Besides, LLMs are getting good at catching this kind of bug.

At this point it might be enough to add a file under Documentation/
describing the rules, plus a code comment pointing to it.
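
As a rough illustration of the kind of rule such a file could spell out,
here is a minimal userspace sketch in C. It is not kernel code: the toy_*
names, the simplified structs and the differing field offsets are made up
for illustration. Only the enum values and the comparison of the batch id
against the ring size mirror the quoted patch, and treating "id == ring
size" as "no batch pending" is my reading of that check, not something the
patch states.

```c
#include <assert.h>
#include <stdbool.h>

enum vq_layout {
	SPLIT = 0,
	PACKED,
	SPLIT_IN_ORDER,
	PACKED_IN_ORDER,
	VQ_TYPE_MAX,
};

/* Toy stand-ins: "num" deliberately sits at different offsets. */
struct toy_split  { unsigned int num; };
struct toy_packed { unsigned int flags; unsigned int num; };

struct toy_vq {
	enum vq_layout layout;
	/* Only the member matching "layout" holds valid data. */
	union {
		struct toy_split split;
		struct toy_packed packed;
	};
	unsigned int batch_last_id;
};

static bool toy_is_packed(const struct toy_vq *vq)
{
	return vq->layout == PACKED || vq->layout == PACKED_IN_ORDER;
}

/*
 * Rule 1: layout-agnostic code asks for the ring size through one
 * helper instead of reaching into ->split or ->packed directly.
 */
static unsigned int toy_vq_num(const struct toy_vq *vq)
{
	assert(vq->layout < VQ_TYPE_MAX);
	return toy_is_packed(vq) ? vq->packed.num : vq->split.num;
}

/*
 * Rule 2: a *_split_* function touches ->split only. Reading
 * ->packed.num here would compile fine, but in this toy layout it
 * would alias different bytes and compare against the wrong value.
 */
static bool toy_more_used_split_in_order(const struct toy_vq *vq)
{
	return vq->batch_last_id != vq->split.num;
}
```

With a split in-order queue of size 256, toy_more_used_split_in_order()
reports a pending batch only while batch_last_id differs from 256. A helper
like toy_vq_num() is one possible way to keep shared code from picking the
wrong union member; separate files with private structures, as suggested
above, would be another.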


-- 
MST