Message-ID: <30CF0FF9-3B1D-48AF-AFB4-73E20C404357@neclab.eu>
Date: Fri, 22 May 2015 10:24:39 +0000
From: Joao Martins <Joao.Martins@...lab.eu>
To: Wei Liu <wei.liu2@...rix.com>
CC: "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"ian.campbell@...rix.com" <ian.campbell@...rix.com>,
"david.vrabel@...rix.com" <david.vrabel@...rix.com>,
"boris.ostrovsky@...cle.com" <boris.ostrovsky@...cle.com>,
"konrad.wilk@...cle.com" <konrad.wilk@...cle.com>
Subject: Re: [RFC PATCH 03/13] xen-netback: implement TX persistent grants
On 19 May 2015, at 17:23, Wei Liu <wei.liu2@...rix.com> wrote:
> On Tue, May 12, 2015 at 07:18:27PM +0200, Joao Martins wrote:
>> Introduces persistent grants for the TX path, which follows a similar code
>> path to grant mapping.
>>
>> It starts by checking if there's a persistent grant available for header
>> and frags grefs and if so setting it in tx_pgrants. If no persistent grant
>> is found in the tree for the header it will resort to grant copy (but
>> preparing the map ops and add them laster). For the frags it will use the
> ^
> later
>
>> tree page pool, and if no pages are available it falls back to grant map/unmap
>> using mmap_pages. When the skb destructor callback gets called, we release the
>> slot and persistent grant within the callback to avoid waking up the
>> dealloc thread. As long as there are no unmaps to be done, the dealloc thread
>> will remain inactive.
>>
>
> This scheme looks complicated. Can we just use one
> scheme at a time? What's the rationale for using this combined scheme?
> Maybe you're thinking about using a max_grants < ring_size to save
> memory?
Yes, my purpose was to allow max_grants < ring_size to reduce the amount of
memory mapped. In a bulk transfer test with iperf, the maximum number of
grants in the tree was <160 TX grants, without affecting peak performance;
using pktgen, though, fills the tree completely.
The second reason is to handle the case of a (malicious?) frontend providing
more grefs than the maximum allowed, in which case I fall back to grant map/unmap.
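To make the fallback concrete, the per-slot decision looks roughly like this
(simplified sketch; persistent_gnt_tree_full() is an illustrative name for the
tree's capacity check, not necessarily what the patch calls it):

	/* Prefer a persistent grant; create one if the gref is new and the
	 * tree still has room (max_grants may be below the ring size).
	 * Otherwise degrade gracefully to the grant map/unmap path, which
	 * also covers a frontend offering more grefs than allowed.
	 */
	persistent_gnt = get_persistent_gnt(tree, ref);
	if (!persistent_gnt && !persistent_gnt_tree_full(tree))
		persistent_gnt = xenvif_pgrant_new(tree, gop);
	if (IS_ERR_OR_NULL(persistent_gnt))
		return false;	/* fall back: queue a regular map op */
	queue->tx_pgrants[pending_idx] = persistent_gnt;
	return true;		/* no map op needed for this slot */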
>
> I've only skimmed the patch; I will do a detailed review once we're sure
> this is the right way to go.
>
>> Results show an improvement of 46% (1.82 vs 1.24 Mpps, 64 byte packets)
>> measured with pktgen and over 48% (21.6 vs 14.5 Gbit/s) measured with
>> iperf (TCP) with 4 parallel flows on a 1-queue vif, DomU to Dom0.
>> Tests ran on an Intel Xeon E5-1650 v2 with HT disabled.
>> […]
>> int xenvif_init_queue(struct xenvif_queue *queue)
>> {
>> int err, i;
>> @@ -496,9 +524,23 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>> .ctx = NULL,
>> .desc = i };
>> queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
>> + queue->tx_pgrants[i] = NULL;
>> + }
>> +
>> + if (queue->vif->persistent_grants) {
>> + err = init_persistent_gnt_tree(&queue->tx_gnts_tree,
>> + queue->tx_gnts_pages,
>> + XEN_NETIF_TX_RING_SIZE);
>> + if (err)
>> + goto err_disable;
>> }
>>
>> return 0;
>> +
>> +err_disable:
>> + netdev_err(queue->vif->dev, "Could not reserve tree pages.");
>> + queue->vif->persistent_grants = 0;
>
> You can just move the above two lines under `if (err)'.
>
> Also see below.
In the next patch this is also a common cleanup path, so I did it this way
to avoid moving this hunk around.
>> + return 0;
>> }
>>
>> void xenvif_carrier_on(struct xenvif *vif)
>> @@ -654,6 +696,10 @@ void xenvif_disconnect(struct xenvif *vif)
>> }
>>
>> xenvif_unmap_frontend_rings(queue);
>> +
>> + if (queue->vif->persistent_grants)
>> + deinit_persistent_gnt_tree(&queue->tx_gnts_tree,
>> + queue->tx_gnts_pages);
>
> If the init function fails on queue N (N>0) you now leak resources.
Correct. I should return -ENOMEM (on init) and not 0.
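Concretely, the fix on the error path is just (sketch):

err_disable:
	netdev_err(queue->vif->dev, "Could not reserve tree pages.");
	queue->vif->persistent_grants = 0;
	return err;	/* was "return 0"; lets the caller unwind queues 0..N-1 */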
>> }
>> }
>>
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 332e489..529d7c3 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -269,6 +269,11 @@ static inline unsigned long idx_to_kaddr(struct xenvif_queue *queue,
>> return (unsigned long)pfn_to_kaddr(idx_to_pfn(queue, idx));
>> }
>>
>> +static inline void *page_to_kaddr(struct page *page)
>> +{
>> + return pfn_to_kaddr(page_to_pfn(page));
>> +}
>> +
>> #define callback_param(vif, pending_idx) \
>> (vif->pending_tx_info[pending_idx].callback_struct)
>>
>> @@ -299,6 +304,29 @@ static inline pending_ring_idx_t pending_index(unsigned i)
>> return i & (MAX_PENDING_REQS-1);
>> }
>>
>> +/* Creates a new persistent grant and add it to the tree.
>> + */
>> +static struct persistent_gnt *xenvif_pgrant_new(struct persistent_gnt_tree *tree,
>> + struct gnttab_map_grant_ref *gop)
>> +{
>> + struct persistent_gnt *persistent_gnt;
>> +
>> + persistent_gnt = kmalloc(sizeof(*persistent_gnt), GFP_KERNEL);
>
> xenvif_pgrant_new can be called from NAPI, which runs in softirq context
> and therefore doesn't allow you to sleep.
Silly mistake; I will fix this. The whole point of the tree page pool was
that we aren't allowed to sleep, both here and in ndo_start_xmit (in patch #6).
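The minimal fix for v2 is probably (sketch; GFP_ATOMIC because the struct
allocation happens in softirq context, while the pages themselves still come
from the preallocated pool):

-	persistent_gnt = kmalloc(sizeof(*persistent_gnt), GFP_KERNEL);
+	/* May be called from NAPI poll (softirq): must not sleep. */
+	persistent_gnt = kmalloc(sizeof(*persistent_gnt), GFP_ATOMIC);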
>> […]
>>
>> +/* Checks if there's a persistent grant available for gref and
>> + * if so, set it also in the tx_pgrants array that keeps the ones
>> + * in use.
>> + */
>> +static bool xenvif_tx_pgrant_available(struct xenvif_queue *queue,
>> + grant_ref_t ref, u16 pending_idx,
>> + bool *can_map)
>> +{
>> + struct persistent_gnt_tree *tree = &queue->tx_gnts_tree;
>> + struct persistent_gnt *persistent_gnt;
>> + bool busy;
>> +
>> + if (!queue->vif->persistent_grants)
>> + return false;
>> +
>> + persistent_gnt = get_persistent_gnt(tree, ref);
>> +
>> + /* If gref is already in use we fallback, since it would
>> + * otherwise mean re-adding the same gref to the tree
>> + */
>> + busy = IS_ERR(persistent_gnt);
>> + if (unlikely(busy))
>> + persistent_gnt = NULL;
>> +
>
> Under what circumstances can we retrieve an already-in-use persistent
> grant? You seem to suggest this is a bug in the RX case.
A guest could try to share the same mapped page across multiple frags, in
which case I fall back to map/unmap. I think this is a limitation in the way
we manage the persistent grants, where we can only have a single reference
to a persistent grant in flight.
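For reference, the lookup follows xen-blkback's get_persistent_gnt (sketch;
the PERSISTENT_GNT_ACTIVE flag name is borrowed from blkback, not necessarily
this series):

	if (data->gnt == gref) {
		if (test_bit(PERSISTENT_GNT_ACTIVE, data->flags))
			return ERR_PTR(-EBUSY);	/* in flight: caller falls back */
		set_bit(PERSISTENT_GNT_ACTIVE, data->flags);
		return data;
	}

which is why two frags referencing the same gref cannot both use the
persistent mapping at the same time.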
>> […]
>> MAX_PENDING_REQS);
>> index = pending_index(queue->dealloc_prod);
>> @@ -1691,7 +1939,10 @@ void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success)
>> smp_wmb();
>> queue->dealloc_prod++;
>> } while (ubuf);
>> - wake_up(&queue->dealloc_wq);
>> + /* Wake up only when there are grants to unmap */
>> + if (dealloc_prod_save != queue->dealloc_prod)
>> + wake_up(&queue->dealloc_wq);
>> +
>> spin_unlock_irqrestore(&queue->callback_lock, flags);
>>
>> if (likely(zerocopy_success))
>> @@ -1779,10 +2030,13 @@ int xenvif_tx_action(struct xenvif_queue *queue, int budget)
>>
>> xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);
>>
>> - if (nr_cops == 0)
>> + if (!queue->vif->persistent_grants &&
>> + nr_cops == 0)
>
> You can just move nr_cops to previous line.
>
> Wei.