lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <50E54DC0.4040609@redhat.com>
Date:	Thu, 03 Jan 2013 10:22:08 +0100
From:	Paolo Bonzini <pbonzini@...hat.com>
To:	Rusty Russell <rusty@...tcorp.com.au>
CC:	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
	kvm@...r.kernel.org, mst@...hat.com, hutao@...fujitsu.com,
	virtualization@...ts.linux-foundation.org, stefanha@...hat.com
Subject: Re: [PATCH v2 1/5] virtio: add functions for piecewise addition of
 buffers

Il 02/01/2013 06:03, Rusty Russell ha scritto:
> Paolo Bonzini <pbonzini@...hat.com> writes:
>> The virtqueue_add_buf function has two limitations:
>>
>> 1) it requires the caller to provide all the buffers in a single call;
>>
>> 2) it does not support chained scatterlists: the buffers must be
>> provided as an array of struct scatterlist;
> 
> Chained scatterlists are a horrible interface, but that doesn't mean we
> shouldn't support them if there's a need.
> 
> I think I once even had a patch which passed two chained sgs, rather
> than a combo sg and two length numbers.  It's very old, but I've pasted
> it below.
> 
> Duplicating the implementation by having another interface is pretty
> nasty; I think I'd prefer the chained scatterlists, if that's optimal
> for you.

Unfortunately, that cannot work because not all architectures support
chained scatterlists.  Having two different implementations on the
driver side, with different locking rules, depending on the support for
chained scatterlists is awful.

(Also, as you mention chained scatterlists are horrible.  They'd happen
to work for virtio-scsi, but not for virtio-blk where the response
status is part of the footer, not the header).

We can move all drivers to virtqueue_add_sg, I just didn't do it in this
series because I first wanted to get the API right.

Paolo

> Cheers,
> Rusty.
> 
> From: Rusty Russell <rusty@...tcorp.com.au>
> Subject: virtio: use chained scatterlists.
> 
> Rather than handing a scatterlist[] and out and in numbers to
> virtqueue_add_buf(), hand two separate ones which can be chained.
> 
> I shall refrain from ranting about what a disgusting hack chained
> scatterlists are.  I'll just note that this doesn't make things
> simpler (see diff).
> 
> The scatterlists we use can be too large for the stack, so we put them
> in our device struct and reuse them.  But in many cases we don't want
> to pay the cost of sg_init_table() as we don't know how many elements
> we'll have and we'd have to initialize the entire table.
> 
> This means we have two choices: carefully reset the end markers after
> we call virtqueue_add_buf(), which we do in virtio_net for the xmit
> path where it's easy and we want to be optimal.  Elsewhere we
> implement a helper to unset the end markers after we've filled the
> array.
> 
> Signed-off-by: Rusty Russell <rusty@...tcorp.com.au>
> ---
>  drivers/block/virtio_blk.c          |   37 +++++++++++++-----
>  drivers/char/hw_random/virtio-rng.c |    2 -
>  drivers/char/virtio_console.c       |    6 +--
>  drivers/net/virtio_net.c            |   67 ++++++++++++++++++---------------
>  drivers/virtio/virtio_balloon.c     |    6 +--
>  drivers/virtio/virtio_ring.c        |   71 ++++++++++++++++++++++--------------
>  include/linux/virtio.h              |    5 +-
>  net/9p/trans_virtio.c               |   38 +++++++++++++++++--
>  8 files changed, 151 insertions(+), 81 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -100,6 +100,14 @@ static void blk_done(struct virtqueue *v
>  	spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
>  }
>  
> +static void sg_unset_end_markers(struct scatterlist *sg, unsigned int num)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num; i++)
> +		sg[i].page_link &= ~0x02;
> +}
> +
>  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
>  		   struct request *req)
>  {
> @@ -140,6 +148,7 @@ static bool do_req(struct request_queue 
>  		}
>  	}
>  
> +	/* We layout out scatterlist in a single array, out then in. */
>  	sg_set_buf(&vblk->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
>  
>  	/*
> @@ -151,17 +160,8 @@ static bool do_req(struct request_queue 
>  	if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC)
>  		sg_set_buf(&vblk->sg[out++], vbr->req->cmd, vbr->req->cmd_len);
>  
> +	/* This marks the end of the sg list at vblk->sg[out]. */
>  	num = blk_rq_map_sg(q, vbr->req, vblk->sg + out);
> -
> -	if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) {
> -		sg_set_buf(&vblk->sg[num + out + in++], vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
> -		sg_set_buf(&vblk->sg[num + out + in++], &vbr->in_hdr,
> -			   sizeof(vbr->in_hdr));
> -	}
> -
> -	sg_set_buf(&vblk->sg[num + out + in++], &vbr->status,
> -		   sizeof(vbr->status));
> -
>  	if (num) {
>  		if (rq_data_dir(vbr->req) == WRITE) {
>  			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
> @@ -172,7 +172,22 @@ static bool do_req(struct request_queue 
>  		}
>  	}
>  
> -	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
> +	if (vbr->req->cmd_type == REQ_TYPE_BLOCK_PC) {
> +		sg_set_buf(&vblk->sg[out + in++], vbr->req->sense,
> +			   SCSI_SENSE_BUFFERSIZE);
> +		sg_set_buf(&vblk->sg[out + in++], &vbr->in_hdr,
> +			   sizeof(vbr->in_hdr));
> +	}
> +
> +	sg_set_buf(&vblk->sg[out + in++], &vbr->status,
> +		   sizeof(vbr->status));
> +
> +	sg_unset_end_markers(vblk->sg, out+in);
> +	sg_mark_end(&vblk->sg[out-1]);
> +	sg_mark_end(&vblk->sg[out+in-1]);
> +
> +	if (virtqueue_add_buf(vblk->vq, vblk->sg, vblk->sg+out, vbr, GFP_ATOMIC)
> +	    < 0) {
>  		mempool_free(vbr, vblk->pool);
>  		return false;
>  	}
> diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
> --- a/drivers/char/hw_random/virtio-rng.c
> +++ b/drivers/char/hw_random/virtio-rng.c
> @@ -47,7 +47,7 @@ static void register_buffer(u8 *buf, siz
>  	sg_init_one(&sg, buf, size);
>  
>  	/* There should always be room for one buffer. */
> -	if (virtqueue_add_buf(vq, &sg, 0, 1, buf, GFP_KERNEL) < 0)
> +	if (virtqueue_add_buf(vq, NULL, &sg, buf, GFP_KERNEL) < 0)
>  		BUG();
>  
>  	virtqueue_kick(vq);
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -392,7 +392,7 @@ static int add_inbuf(struct virtqueue *v
>  
>  	sg_init_one(sg, buf->buf, buf->size);
>  
> -	ret = virtqueue_add_buf(vq, sg, 0, 1, buf, GFP_ATOMIC);
> +	ret = virtqueue_add_buf(vq, NULL, sg, buf, GFP_ATOMIC);
>  	virtqueue_kick(vq);
>  	return ret;
>  }
> @@ -457,7 +457,7 @@ static ssize_t __send_control_msg(struct
>  	vq = portdev->c_ovq;
>  
>  	sg_init_one(sg, &cpkt, sizeof(cpkt));
> -	if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt, GFP_ATOMIC) >= 0) {
> +	if (virtqueue_add_buf(vq, sg, NULL, &cpkt, GFP_ATOMIC) >= 0) {
>  		virtqueue_kick(vq);
>  		while (!virtqueue_get_buf(vq, &len))
>  			cpu_relax();
> @@ -506,7 +506,7 @@ static ssize_t send_buf(struct port *por
>  	reclaim_consumed_buffers(port);
>  
>  	sg_init_one(sg, in_buf, in_count);
> -	ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf, GFP_ATOMIC);
> +	ret = virtqueue_add_buf(out_vq, sg, NULL, in_buf, GFP_ATOMIC);
>  
>  	/* Tell Host to go! */
>  	virtqueue_kick(out_vq);
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -376,11 +376,11 @@ static int add_recvbuf_small(struct virt
>  	skb_put(skb, MAX_PACKET_LEN);
>  
>  	hdr = skb_vnet_hdr(skb);
> +	sg_init_table(vi->rx_sg, 2);
>  	sg_set_buf(vi->rx_sg, &hdr->hdr, sizeof hdr->hdr);
> -
>  	skb_to_sgvec(skb, vi->rx_sg + 1, 0, skb->len);
>  
> -	err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 2, skb, gfp);
> +	err = virtqueue_add_buf(vi->rvq, NULL, vi->rx_sg, skb, gfp);
>  	if (err < 0)
>  		dev_kfree_skb(skb);
>  
> @@ -393,6 +393,8 @@ static int add_recvbuf_big(struct virtne
>  	char *p;
>  	int i, err, offset;
>  
> +	sg_init_table(vi->rx_sg, MAX_SKB_FRAGS + 1);
> +
>  	/* page in vi->rx_sg[MAX_SKB_FRAGS + 1] is list tail */
>  	for (i = MAX_SKB_FRAGS + 1; i > 1; --i) {
>  		first = get_a_page(vi, gfp);
> @@ -425,8 +427,8 @@ static int add_recvbuf_big(struct virtne
>  
>  	/* chain first in list head */
>  	first->private = (unsigned long)list;
> -	err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, MAX_SKB_FRAGS + 2,
> -				first, gfp);
> +
> +	err = virtqueue_add_buf(vi->rvq, NULL, vi->rx_sg, first, gfp);
>  	if (err < 0)
>  		give_pages(vi, first);
>  
> @@ -444,7 +446,7 @@ static int add_recvbuf_mergeable(struct 
>  
>  	sg_init_one(vi->rx_sg, page_address(page), PAGE_SIZE);
>  
> -	err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 1, page, gfp);
> +	err = virtqueue_add_buf(vi->rvq, NULL, vi->rx_sg, page, gfp);
>  	if (err < 0)
>  		give_pages(vi, page);
>  
> @@ -581,6 +583,7 @@ static int xmit_skb(struct virtnet_info 
>  {
>  	struct skb_vnet_hdr *hdr = skb_vnet_hdr(skb);
>  	const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
> +	int ret;
>  
>  	pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest);
>  
> @@ -620,8 +623,16 @@ static int xmit_skb(struct virtnet_info 
>  		sg_set_buf(vi->tx_sg, &hdr->hdr, sizeof hdr->hdr);
>  
>  	hdr->num_sg = skb_to_sgvec(skb, vi->tx_sg + 1, 0, skb->len) + 1;
> -	return virtqueue_add_buf(vi->svq, vi->tx_sg, hdr->num_sg,
> -				 0, skb, GFP_ATOMIC);
> +
> +	ret = virtqueue_add_buf(vi->svq, vi->tx_sg, NULL, skb, GFP_ATOMIC);
> +
> +	/*
> +	 * An optimization: clear the end bit set by skb_to_sgvec, so
> +	 * we can simply re-use vi->tx_sg[] next time.
> +	 */
> +	vi->tx_sg[hdr->num_sg-1].page_link &= ~0x02;
> +
> +	return ret;
>  }
>  
>  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> @@ -757,32 +768,31 @@ static int virtnet_open(struct net_devic
>   * never fail unless improperly formated.
>   */
>  static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> -				 struct scatterlist *data, int out, int in)
> +				 struct scatterlist *cmdsg)
>  {
> -	struct scatterlist *s, sg[VIRTNET_SEND_COMMAND_SG_MAX + 2];
> +	struct scatterlist in[1], out[2];
>  	struct virtio_net_ctrl_hdr ctrl;
>  	virtio_net_ctrl_ack status = ~0;
>  	unsigned int tmp;
> -	int i;
>  
>  	/* Caller should know better */
> -	BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ||
> -		(out + in > VIRTNET_SEND_COMMAND_SG_MAX));
> +	BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
>  
> -	out++; /* Add header */
> -	in++; /* Add return status */
> -
> +	/* Prepend header to output. */
> +	sg_init_table(out, 2);
>  	ctrl.class = class;
>  	ctrl.cmd = cmd;
> +	sg_set_buf(&out[0], &ctrl, sizeof(ctrl));
> +	if (cmdsg)
> +		sg_chain(out, 2, cmdsg);
> +	else
> +		sg_mark_end(&out[0]);
>  
> -	sg_init_table(sg, out + in);
> +	/* Status response. */
> +	sg_init_one(in, &status, sizeof(status));
>  
> -	sg_set_buf(&sg[0], &ctrl, sizeof(ctrl));
> -	for_each_sg(data, s, out + in - 2, i)
> -		sg_set_buf(&sg[i + 1], sg_virt(s), s->length);
> -	sg_set_buf(&sg[out + in - 1], &status, sizeof(status));
>  
> -	BUG_ON(virtqueue_add_buf(vi->cvq, sg, out, in, vi, GFP_ATOMIC) < 0);
> +	BUG_ON(virtqueue_add_buf(vi->cvq, out, in, vi, GFP_ATOMIC) < 0);
>  
>  	virtqueue_kick(vi->cvq);
>  
> @@ -800,8 +810,7 @@ static void virtnet_ack_link_announce(st
>  {
>  	rtnl_lock();
>  	if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE,
> -				  VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL,
> -				  0, 0))
> +				  VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL))
>  		dev_warn(&vi->dev->dev, "Failed to ack link announce.\n");
>  	rtnl_unlock();
>  }
> @@ -839,16 +848,14 @@ static void virtnet_set_rx_mode(struct n
>  	sg_init_one(sg, &promisc, sizeof(promisc));
>  
>  	if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX,
> -				  VIRTIO_NET_CTRL_RX_PROMISC,
> -				  sg, 1, 0))
> +				  VIRTIO_NET_CTRL_RX_PROMISC, sg))
>  		dev_warn(&dev->dev, "Failed to %sable promisc mode.\n",
>  			 promisc ? "en" : "dis");
>  
>  	sg_init_one(sg, &allmulti, sizeof(allmulti));
>  
>  	if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX,
> -				  VIRTIO_NET_CTRL_RX_ALLMULTI,
> -				  sg, 1, 0))
> +				  VIRTIO_NET_CTRL_RX_ALLMULTI, sg))
>  		dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
>  			 allmulti ? "en" : "dis");
>  
> @@ -887,7 +894,7 @@ static void virtnet_set_rx_mode(struct n
>  
>  	if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MAC,
>  				  VIRTIO_NET_CTRL_MAC_TABLE_SET,
> -				  sg, 2, 0))
> +				  sg))
>  		dev_warn(&dev->dev, "Failed to set MAC fitler table.\n");
>  
>  	kfree(buf);
> @@ -901,7 +908,7 @@ static int virtnet_vlan_rx_add_vid(struc
>  	sg_init_one(&sg, &vid, sizeof(vid));
>  
>  	if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> -				  VIRTIO_NET_CTRL_VLAN_ADD, &sg, 1, 0))
> +				  VIRTIO_NET_CTRL_VLAN_ADD, &sg))
>  		dev_warn(&dev->dev, "Failed to add VLAN ID %d.\n", vid);
>  	return 0;
>  }
> @@ -914,7 +921,7 @@ static int virtnet_vlan_rx_kill_vid(stru
>  	sg_init_one(&sg, &vid, sizeof(vid));
>  
>  	if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> -				  VIRTIO_NET_CTRL_VLAN_DEL, &sg, 1, 0))
> +				  VIRTIO_NET_CTRL_VLAN_DEL, &sg))
>  		dev_warn(&dev->dev, "Failed to kill VLAN ID %d.\n", vid);
>  	return 0;
>  }
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -102,7 +102,7 @@ static void tell_host(struct virtio_ball
>  	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
>  
>  	/* We should always be able to add one buffer to an empty queue. */
> -	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
> +	if (virtqueue_add_buf(vq, &sg, NULL, vb, GFP_KERNEL) < 0)
>  		BUG();
>  	virtqueue_kick(vq);
>  
> @@ -246,7 +246,7 @@ static void stats_handle_request(struct 
>  	if (!virtqueue_get_buf(vq, &len))
>  		return;
>  	sg_init_one(&sg, vb->stats, sizeof(vb->stats));
> -	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
> +	if (virtqueue_add_buf(vq, &sg, NULL, vb, GFP_KERNEL) < 0)
>  		BUG();
>  	virtqueue_kick(vq);
>  }
> @@ -331,7 +331,7 @@ static int init_vqs(struct virtio_balloo
>  		 * use it to signal us later.
>  		 */
>  		sg_init_one(&sg, vb->stats, sizeof vb->stats);
> -		if (virtqueue_add_buf(vb->stats_vq, &sg, 1, 0, vb, GFP_KERNEL)
> +		if (virtqueue_add_buf(vb->stats_vq, &sg, NULL, vb, GFP_KERNEL)
>  		    < 0)
>  			BUG();
>  		virtqueue_kick(vb->stats_vq);
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -121,35 +121,41 @@ struct vring_virtqueue
>  
>  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
>  
> +/* This doesn't have the counter that for_each_sg() has */
> +#define foreach_sg(sglist, i)			\
> +	for (i = (sglist); i; i = sg_next(i))
> +
>  /* Set up an indirect table of descriptors and add it to the queue. */
>  static int vring_add_indirect(struct vring_virtqueue *vq,
> -			      struct scatterlist sg[],
> -			      unsigned int out,
> -			      unsigned int in,
> +			      unsigned int num,
> +			      const struct scatterlist *out,
> +			      const struct scatterlist *in,
>  			      gfp_t gfp)
>  {
> +	const struct scatterlist *sg;
>  	struct vring_desc *desc;
>  	unsigned head;
>  	int i;
>  
> -	desc = kmalloc((out + in) * sizeof(struct vring_desc), gfp);
> +	desc = kmalloc(num * sizeof(struct vring_desc), gfp);
>  	if (!desc)
>  		return -ENOMEM;
>  
>  	/* Transfer entries from the sg list into the indirect page */
> -	for (i = 0; i < out; i++) {
> +	i = 0;
> +	foreach_sg(out, sg) {
>  		desc[i].flags = VRING_DESC_F_NEXT;
>  		desc[i].addr = sg_phys(sg);
>  		desc[i].len = sg->length;
>  		desc[i].next = i+1;
> -		sg++;
> +		i++;
>  	}
> -	for (; i < (out + in); i++) {
> +	foreach_sg(in, sg) {
>  		desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
>  		desc[i].addr = sg_phys(sg);
>  		desc[i].len = sg->length;
>  		desc[i].next = i+1;
> -		sg++;
> +		i++;
>  	}
>  
>  	/* Last one doesn't continue. */
> @@ -171,12 +177,21 @@ static int vring_add_indirect(struct vri
>  	return head;
>  }
>  
> +static unsigned int count_sg(const struct scatterlist *sg)
> +{
> +	unsigned int count = 0;
> +	const struct scatterlist *i;
> +
> +	foreach_sg(sg, i)
> +		count++;
> +	return count;
> +}
> +
>  /**
>   * virtqueue_add_buf - expose buffer to other end
>   * @vq: the struct virtqueue we're talking about.
> - * @sg: the description of the buffer(s).
> - * @out_num: the number of sg readable by other side
> - * @in_num: the number of sg which are writable (after readable ones)
> + * @out: the description of the output buffer(s).
> + * @in: the description of the input buffer(s).
>   * @data: the token identifying the buffer.
>   * @gfp: how to do memory allocations (if necessary).
>   *
> @@ -189,20 +204,23 @@ static int vring_add_indirect(struct vri
>   * we can put an entire sg[] array inside a single queue entry.
>   */
>  int virtqueue_add_buf(struct virtqueue *_vq,
> -		      struct scatterlist sg[],
> -		      unsigned int out,
> -		      unsigned int in,
> +		      const struct scatterlist *out,
> +		      const struct scatterlist *in,
>  		      void *data,
>  		      gfp_t gfp)
>  {
>  	struct vring_virtqueue *vq = to_vvq(_vq);
> -	unsigned int i, avail, uninitialized_var(prev);
> +	unsigned int i, avail, uninitialized_var(prev), num;
> +	const struct scatterlist *sg;
>  	int head;
>  
>  	START_USE(vq);
>  
>  	BUG_ON(data == NULL);
>  
> +	num = count_sg(out) + count_sg(in);
> +	BUG_ON(num == 0);
> +
>  #ifdef DEBUG
>  	{
>  		ktime_t now = ktime_get();
> @@ -218,18 +236,17 @@ int virtqueue_add_buf(struct virtqueue *
>  
>  	/* If the host supports indirect descriptor tables, and we have multiple
>  	 * buffers, then go indirect. FIXME: tune this threshold */
> -	if (vq->indirect && (out + in) > 1 && vq->num_free) {
> -		head = vring_add_indirect(vq, sg, out, in, gfp);
> +	if (vq->indirect && num > 1 && vq->num_free) {
> +		head = vring_add_indirect(vq, num, out, in, gfp);
>  		if (likely(head >= 0))
>  			goto add_head;
>  	}
>  
> -	BUG_ON(out + in > vq->vring.num);
> -	BUG_ON(out + in == 0);
> +	BUG_ON(num > vq->vring.num);
>  
> -	if (vq->num_free < out + in) {
> +	if (vq->num_free < num) {
>  		pr_debug("Can't add buf len %i - avail = %i\n",
> -			 out + in, vq->num_free);
> +			 num, vq->num_free);
>  		/* FIXME: for historical reasons, we force a notify here if
>  		 * there are outgoing parts to the buffer.  Presumably the
>  		 * host should service the ring ASAP. */
> @@ -240,22 +257,24 @@ int virtqueue_add_buf(struct virtqueue *
>  	}
>  
>  	/* We're about to use some buffers from the free list. */
> -	vq->num_free -= out + in;
> +	vq->num_free -= num;
>  
>  	head = vq->free_head;
> -	for (i = vq->free_head; out; i = vq->vring.desc[i].next, out--) {
> +
> +	i = vq->free_head;
> +	foreach_sg(out, sg) {
>  		vq->vring.desc[i].flags = VRING_DESC_F_NEXT;
>  		vq->vring.desc[i].addr = sg_phys(sg);
>  		vq->vring.desc[i].len = sg->length;
>  		prev = i;
> -		sg++;
> +		i = vq->vring.desc[i].next;
>  	}
> -	for (; in; i = vq->vring.desc[i].next, in--) {
> +	foreach_sg(in, sg) {
>  		vq->vring.desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
>  		vq->vring.desc[i].addr = sg_phys(sg);
>  		vq->vring.desc[i].len = sg->length;
>  		prev = i;
> -		sg++;
> +		i = vq->vring.desc[i].next;
>  	}
>  	/* Last one doesn't continue. */
>  	vq->vring.desc[prev].flags &= ~VRING_DESC_F_NEXT;
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -26,9 +26,8 @@ struct virtqueue {
>  };
>  
>  int virtqueue_add_buf(struct virtqueue *vq,
> -		      struct scatterlist sg[],
> -		      unsigned int out_num,
> -		      unsigned int in_num,
> +		      const struct scatterlist *out,
> +		      const struct scatterlist *in,
>  		      void *data,
>  		      gfp_t gfp);
>  
> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -244,6 +244,14 @@ pack_sg_list_p(struct scatterlist *sg, i
>  	return index - start;
>  }
>  
> +static void sg_unset_end_markers(struct scatterlist *sg, unsigned int num)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num; i++)
> +		sg[i].page_link &= ~0x02;
> +}
> +
>  /**
>   * p9_virtio_request - issue a request
>   * @client: client instance issuing the request
> @@ -258,6 +266,7 @@ p9_virtio_request(struct p9_client *clie
>  	int in, out;
>  	unsigned long flags;
>  	struct virtio_chan *chan = client->trans;
> +	const struct scatterlist *outsg = NULL, *insg = NULL;
>  
>  	p9_debug(P9_DEBUG_TRANS, "9p debug: virtio request\n");
>  
> @@ -268,12 +277,21 @@ req_retry:
>  	/* Handle out VirtIO ring buffers */
>  	out = pack_sg_list(chan->sg, 0,
>  			   VIRTQUEUE_NUM, req->tc->sdata, req->tc->size);
> +	if (out) {
> +		sg_unset_end_markers(chan->sg, out-1);
> +		sg_mark_end(&chan->sg[out-1]);
> +		outsg = chan->sg;
> +	}
>  
>  	in = pack_sg_list(chan->sg, out,
>  			  VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity);
> +	if (in) {
> +		sg_unset_end_markers(chan->sg+out, in-1);
> +		sg_mark_end(&chan->sg[out+in-1]);
> +		insg = chan->sg+out;
> +	}
>  
> -	err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc,
> -				GFP_ATOMIC);
> +	err = virtqueue_add_buf(chan->vq, outsg, insg, req->tc, GFP_ATOMIC);
>  	if (err < 0) {
>  		if (err == -ENOSPC) {
>  			chan->ring_bufs_avail = 0;
> @@ -355,6 +377,7 @@ p9_virtio_zc_request(struct p9_client *c
>  	int in_nr_pages = 0, out_nr_pages = 0;
>  	struct page **in_pages = NULL, **out_pages = NULL;
>  	struct virtio_chan *chan = client->trans;
> +	struct scatterlist *insg = NULL, *outsg = NULL;
>  
>  	p9_debug(P9_DEBUG_TRANS, "virtio request\n");
>  
> @@ -402,6 +425,13 @@ req_retry_pinned:
>  	if (out_pages)
>  		out += pack_sg_list_p(chan->sg, out, VIRTQUEUE_NUM,
>  				      out_pages, out_nr_pages, uodata, outlen);
> +
> +	if (out) {
> +		sg_unset_end_markers(chan->sg, out-1);
> +		sg_mark_end(&chan->sg[out-1]);
> +		outsg = chan->sg;
> +	}
> +
>  	/*
>  	 * Take care of in data
>  	 * For example TREAD have 11.
> @@ -414,9 +446,13 @@ req_retry_pinned:
>  	if (in_pages)
>  		in += pack_sg_list_p(chan->sg, out + in, VIRTQUEUE_NUM,
>  				     in_pages, in_nr_pages, uidata, inlen);
> +	if (in) {
> +		sg_unset_end_markers(chan->sg+out, in-1);
> +		sg_mark_end(&chan->sg[out+in-1]);
> +		insg = chan->sg + out;
> +	}
>  
> -	err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc,
> -				GFP_ATOMIC);
> +	err = virtqueue_add_buf(chan->vq, outsg, insg, req->tc, GFP_ATOMIC);
>  	if (err < 0) {
>  		if (err == -ENOSPC) {
>  			chan->ring_bufs_avail = 0;
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ