Date:   Mon, 27 Mar 2023 12:07:01 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Jan Beulich <jbeulich@...e.com>
Cc:     Wei Liu <wei.liu@...nel.org>, Paul Durrant <paul@....org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        xen-devel@...ts.xenproject.org, stable@...r.kernel.org,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] xen/netback: don't do grant copy across page boundary

On 27.03.23 11:49, Jan Beulich wrote:
> On 27.03.2023 10:36, Juergen Gross wrote:
>> Fix xenvif_get_requests() not to do grant copy operations across local
>> page boundaries. This requires to double the maximum number of copy
>> operations per queue, as each copy could now be split into 2.
>>
>> Make sure that struct xenvif_tx_cb doesn't grow too large.
>>
>> Cc: stable@...r.kernel.org
>> Fixes: ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area")
>> Signed-off-by: Juergen Gross <jgross@...e.com>
>> ---
>>   drivers/net/xen-netback/common.h  |  2 +-
>>   drivers/net/xen-netback/netback.c | 25 +++++++++++++++++++++++--
>>   2 files changed, 24 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 3dbfc8a6924e..1fcbd83f7ff2 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -166,7 +166,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>   	struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
>>   	grant_handle_t grant_tx_handle[MAX_PENDING_REQS];
>>   
>> -	struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
>> +	struct gnttab_copy tx_copy_ops[2 * MAX_PENDING_REQS];
>>   	struct gnttab_map_grant_ref tx_map_ops[MAX_PENDING_REQS];
>>   	struct gnttab_unmap_grant_ref tx_unmap_ops[MAX_PENDING_REQS];
>>   	/* passed to gnttab_[un]map_refs with pages under (un)mapping */
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 1b42676ca141..111c179f161b 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -334,6 +334,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
>>   struct xenvif_tx_cb {
>>   	u16 copy_pending_idx[XEN_NETBK_LEGACY_SLOTS_MAX + 1];
>>   	u8 copy_count;
>> +	u32 split_mask;
>>   };
>>   
>>   #define XENVIF_TX_CB(skb) ((struct xenvif_tx_cb *)(skb)->cb)
>> @@ -361,6 +362,8 @@ static inline struct sk_buff *xenvif_alloc_skb(unsigned int size)
>>   	struct sk_buff *skb =
>>   		alloc_skb(size + NET_SKB_PAD + NET_IP_ALIGN,
>>   			  GFP_ATOMIC | __GFP_NOWARN);
>> +
>> +	BUILD_BUG_ON(sizeof(*XENVIF_TX_CB(skb)) > sizeof(skb->cb));
>>   	if (unlikely(skb == NULL))
>>   		return NULL;
>>   
>> @@ -396,11 +399,13 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
>>   	nr_slots = shinfo->nr_frags + 1;
>>   
>>   	copy_count(skb) = 0;
>> +	XENVIF_TX_CB(skb)->split_mask = 0;
>>   
>>   	/* Create copy ops for exactly data_len bytes into the skb head. */
>>   	__skb_put(skb, data_len);
>>   	while (data_len > 0) {
>>   		int amount = data_len > txp->size ? txp->size : data_len;
>> +		bool split = false;
>>   
>>   		cop->source.u.ref = txp->gref;
>>   		cop->source.domid = queue->vif->domid;
>> @@ -413,6 +418,13 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
>>   		cop->dest.u.gmfn = virt_to_gfn(skb->data + skb_headlen(skb)
>>   				               - data_len);
>>   
>> +		/* Don't cross local page boundary! */
>> +		if (cop->dest.offset + amount > XEN_PAGE_SIZE) {
>> +			amount = XEN_PAGE_SIZE - cop->dest.offset;
>> +			XENVIF_TX_CB(skb)->split_mask |= 1U << copy_count(skb);
> 
> Maybe worthwhile to add a BUILD_BUG_ON() somewhere to make sure this
> shift won't grow too large a shift count. The number of slots accepted
> could conceivably be grown past XEN_NETBK_LEGACY_SLOTS_MAX (i.e.
> XEN_NETIF_NR_SLOTS_MIN) at some point.

This is basically impossible due to the size restriction of struct
xenvif_tx_cb.

> 
>> +			split = true;
>> +		}
>> +
>>   		cop->len = amount;
>>   		cop->flags = GNTCOPY_source_gref;
>>   
>> @@ -420,7 +432,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
>>   		pending_idx = queue->pending_ring[index];
>>   		callback_param(queue, pending_idx).ctx = NULL;
>>   		copy_pending_idx(skb, copy_count(skb)) = pending_idx;
>> -		copy_count(skb)++;
>> +		if (!split)
>> +			copy_count(skb)++;
>>   
>>   		cop++;
>>   		data_len -= amount;
>> @@ -441,7 +454,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
>>   			nr_slots--;
>>   		} else {
>>   			/* The copy op partially covered the tx_request.
>> -			 * The remainder will be mapped.
>> +			 * The remainder will be mapped or copied in the next
>> +			 * iteration.
>>   			 */
>>   			txp->offset += amount;
>>   			txp->size -= amount;
>> @@ -539,6 +553,13 @@ static int xenvif_tx_check_gop(struct xenvif_queue *queue,
>>   		pending_idx = copy_pending_idx(skb, i);
>>   
>>   		newerr = (*gopp_copy)->status;
>> +
>> +		/* Split copies need to be handled together. */
>> +		if (XENVIF_TX_CB(skb)->split_mask & (1U << i)) {
>> +			(*gopp_copy)++;
>> +			if (!newerr)
>> +				newerr = (*gopp_copy)->status;
>> +		}
> 
> It isn't guaranteed that a slot may be split only once, is it? Assuming a

I think it is guaranteed.

No slot can cover more than XEN_PAGE_SIZE bytes, as grants are restricted to
that size. There is no way such a data packet could cross two page
boundaries.

In the end the issue isn't whether the copies for the linear area cross
multiple page boundaries, but whether the copies for a single request slot
do. And that can't happen IMO.

> near-64k packet with all tiny non-primary slots, that'll cause those tiny
> slots to all be mapped, but due to
> 
> 		if (ret >= XEN_NETBK_LEGACY_SLOTS_MAX - 1 && data_len < txreq.size)
> 			data_len = txreq.size;
> 
> will, afaict, cause a lot of copying for the primary slot. Therefore I
> think you need a loop here, not just an if(). Plus tx_copy_ops[]'es
> dimension also looks to need further growing to accommodate this. Or
> maybe not - at least the extreme example given would still be fine; more
> generally packets being limited to below 64k means 2*16 slots would
> suffice at one end of the scale, while 2*MAX_PENDING_REQS would at the
> other end (all tiny, including the primary slot). What I haven't fully
> convinced myself of is whether there might be cases in the middle which
> are yet worse.

See the reasoning above. I think it is okay, but maybe I'm missing something.


Juergen

