[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1195676f-59a4-40d8-b459-d2668eb8c5fe@huawei.com>
Date: Mon, 18 Dec 2023 20:39:50 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Mina Almasry <almasrymina@...gle.com>, <linux-kernel@...r.kernel.org>,
	<netdev@...r.kernel.org>, <kvm@...r.kernel.org>,
	<virtualization@...ts.linux.dev>
CC: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
	<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
	<pabeni@...hat.com>, Stefan Hajnoczi <stefanha@...hat.com>, Stefano
 Garzarella <sgarzare@...hat.com>, Jason Gunthorpe <jgg@...dia.com>,
	Christian König <christian.koenig@....com>, Shakeel Butt
	<shakeelb@...gle.com>, Willem de Bruijn <willemdebruijn.kernel@...il.com>
Subject: Re: [PATCH net-next v2 3/3] net: add netmem_t to skb_frag_t
On 2023/12/17 16:09, Mina Almasry wrote:
> Use netmem_t instead of page directly in skb_frag_t. Currently netmem_t
> is always a struct page underneath, but the abstraction allows efforts
> to add support for skb frags not backed by pages.
> 
> There is unfortunately 1 instance where the skb_frag_t is assumed to be
> a bio_vec in kcm. For this case, add a debug assert that the skb frag is
> indeed backed by a page, and do a cast.
> 
> Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so
> that the API can be used to create netmem skbs.
> 
> Signed-off-by: Mina Almasry <almasrymina@...gle.com>
> 
...
>  
> -typedef struct bio_vec skb_frag_t;
> +typedef struct skb_frag {
> +	struct netmem *bv_page;
bv_page -> bv_netmem?
> +	unsigned int bv_len;
> +	unsigned int bv_offset;
> +} skb_frag_t;
>  
>  /**
>   * skb_frag_size() - Returns the size of a skb fragment
> @@ -2431,22 +2436,37 @@ static inline unsigned int skb_pagelen(const struct sk_buff *skb)
>  	return skb_headlen(skb) + __skb_pagelen(skb);
>  }
>  
...
>  /**
> @@ -2462,10 +2482,10 @@ static inline void skb_len_add(struct sk_buff *skb, int delta)
>  }
>  
>  /**
> - * __skb_fill_page_desc - initialise a paged fragment in an skb
> + * __skb_fill_netmem_desc - initialise a paged fragment in an skb
>   * @skb: buffer containing fragment to be initialised
>   * @i: paged fragment index to initialise
> - * @page: the page to use for this fragment
> + * @netmem: the netmem to use for this fragment
>   * @off: the offset to the data with @page
>   * @size: the length of the data
>   *
> @@ -2474,10 +2494,13 @@ static inline void skb_len_add(struct sk_buff *skb, int delta)
>   *
>   * Does not take any additional reference on the fragment.
>   */
> -static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
> -					struct page *page, int off, int size)
> +static inline void __skb_fill_netmem_desc(struct sk_buff *skb, int i,
> +					  struct netmem *netmem, int off,
> +					  int size)
>  {
> -	__skb_fill_page_desc_noacc(skb_shinfo(skb), i, page, off, size);
> +	struct page *page = netmem_to_page(netmem);
> +
> +	__skb_fill_netmem_desc_noacc(skb_shinfo(skb), i, netmem, off, size);
>  
>  	/* Propagate page pfmemalloc to the skb if we can. The problem is
>  	 * that not all callers have unique ownership of the page but rely
> @@ -2485,7 +2508,21 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
>  	 */
>  	page = compound_head(page);
>  	if (page_is_pfmemalloc(page))
> -		skb->pfmemalloc	= true;
> +		skb->pfmemalloc = true;
Is it possible to introduce netmem_is_pfmemalloc() and netmem_compound_head()
for netmem, and have some built-time testing to ensure the implementation
is the same between page_is_pfmemalloc()/compound_head() and
netmem_is_pfmemalloc()/netmem_compound_head()? So that we can avoid the
netmem_to_page() as much as possible, especially in the driver.
> +}
> +
> +static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
> +					struct page *page, int off, int size)
> +{
> +	__skb_fill_netmem_desc(skb, i, page_to_netmem(page), off, size);
> +}
> +
...
>   */
>  static inline struct page *skb_frag_page(const skb_frag_t *frag)
>  {
> -	return frag->bv_page;
> +	return netmem_to_page(frag->bv_page);
It seems we are not able to have a safe type protection for the above
function, as the driver may be able to pass a devmem frag as a param here,
and pass the returned page into the mm subsystem, and compiler is not able
to catch it when compiling.
>  }
>  
>  /**
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 83af8aaeb893..053d220aa2f2 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -845,16 +845,24 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
>  }
>  EXPORT_SYMBOL(__napi_alloc_skb);
>  
...
> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> index 65d1f6755f98..5c46db045f4c 100644
> --- a/net/kcm/kcmsock.c
> +++ b/net/kcm/kcmsock.c
> @@ -636,9 +636,15 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
>  		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
>  			msize += skb_shinfo(skb)->frags[i].bv_len;
>  
> +		/* The cast to struct bio_vec* here assumes the frags are
> +		 * struct page based.
> +		 */
> +		DEBUG_NET_WARN_ON_ONCE(
> +			!skb_frag_page(&skb_shinfo(skb)->frags[0]));
It seems skb_frag_page() always return non-NULL in this patch, the above
checking seems unnecessary?
Powered by blists - more mailing lists
 
