[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1195676f-59a4-40d8-b459-d2668eb8c5fe@huawei.com>
Date: Mon, 18 Dec 2023 20:39:50 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Mina Almasry <almasrymina@...gle.com>, <linux-kernel@...r.kernel.org>,
<netdev@...r.kernel.org>, <kvm@...r.kernel.org>,
<virtualization@...ts.linux.dev>
CC: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Stefan Hajnoczi <stefanha@...hat.com>, Stefano
Garzarella <sgarzare@...hat.com>, Jason Gunthorpe <jgg@...dia.com>,
Christian König <christian.koenig@....com>, Shakeel Butt
<shakeelb@...gle.com>, Willem de Bruijn <willemdebruijn.kernel@...il.com>
Subject: Re: [PATCH net-next v2 3/3] net: add netmem_t to skb_frag_t
On 2023/12/17 16:09, Mina Almasry wrote:
> Use netmem_t instead of page directly in skb_frag_t. Currently netmem_t
> is always a struct page underneath, but the abstraction allows efforts
> to add support for skb frags not backed by pages.
>
> There is unfortunately 1 instance where the skb_frag_t is assumed to be
> a bio_vec in kcm. For this case, add a debug assert that the skb frag is
> indeed backed by a page, and do a cast.
>
> Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so
> that the API can be used to create netmem skbs.
>
> Signed-off-by: Mina Almasry <almasrymina@...gle.com>
>
...
>
> -typedef struct bio_vec skb_frag_t;
> +typedef struct skb_frag {
> + struct netmem *bv_page;
bv_page -> bv_netmem?
> + unsigned int bv_len;
> + unsigned int bv_offset;
> +} skb_frag_t;
>
> /**
> * skb_frag_size() - Returns the size of a skb fragment
> @@ -2431,22 +2436,37 @@ static inline unsigned int skb_pagelen(const struct sk_buff *skb)
> return skb_headlen(skb) + __skb_pagelen(skb);
> }
>
...
> /**
> @@ -2462,10 +2482,10 @@ static inline void skb_len_add(struct sk_buff *skb, int delta)
> }
>
> /**
> - * __skb_fill_page_desc - initialise a paged fragment in an skb
> + * __skb_fill_netmem_desc - initialise a paged fragment in an skb
> * @skb: buffer containing fragment to be initialised
> * @i: paged fragment index to initialise
> - * @page: the page to use for this fragment
> + * @netmem: the netmem to use for this fragment
> * @off: the offset to the data with @page
> * @size: the length of the data
> *
> @@ -2474,10 +2494,13 @@ static inline void skb_len_add(struct sk_buff *skb, int delta)
> *
> * Does not take any additional reference on the fragment.
> */
> -static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
> - struct page *page, int off, int size)
> +static inline void __skb_fill_netmem_desc(struct sk_buff *skb, int i,
> + struct netmem *netmem, int off,
> + int size)
> {
> - __skb_fill_page_desc_noacc(skb_shinfo(skb), i, page, off, size);
> + struct page *page = netmem_to_page(netmem);
> +
> + __skb_fill_netmem_desc_noacc(skb_shinfo(skb), i, netmem, off, size);
>
> /* Propagate page pfmemalloc to the skb if we can. The problem is
> * that not all callers have unique ownership of the page but rely
> @@ -2485,7 +2508,21 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
> */
> page = compound_head(page);
> if (page_is_pfmemalloc(page))
> - skb->pfmemalloc = true;
> + skb->pfmemalloc = true;
Is it possible to introduce netmem_is_pfmemalloc() and netmem_compound_head()
for netmem, and have some built-time testing to ensure the implementation
is the same between page_is_pfmemalloc()/compound_head() and
netmem_is_pfmemalloc()/netmem_compound_head()? So that we can avoid the
netmem_to_page() as much as possible, especially in the driver.
> +}
> +
> +static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
> + struct page *page, int off, int size)
> +{
> + __skb_fill_netmem_desc(skb, i, page_to_netmem(page), off, size);
> +}
> +
...
> */
> static inline struct page *skb_frag_page(const skb_frag_t *frag)
> {
> - return frag->bv_page;
> + return netmem_to_page(frag->bv_page);
It seems we are not able to have a safe type protection for the above
function, as the driver may be able to pass a devmem frag as a param here,
and pass the returned page into the mm subsystem, and compiler is not able
to catch it when compiling.
> }
>
> /**
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 83af8aaeb893..053d220aa2f2 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -845,16 +845,24 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> }
> EXPORT_SYMBOL(__napi_alloc_skb);
>
...
> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> index 65d1f6755f98..5c46db045f4c 100644
> --- a/net/kcm/kcmsock.c
> +++ b/net/kcm/kcmsock.c
> @@ -636,9 +636,15 @@ static int kcm_write_msgs(struct kcm_sock *kcm)
> for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> msize += skb_shinfo(skb)->frags[i].bv_len;
>
> + /* The cast to struct bio_vec* here assumes the frags are
> + * struct page based.
> + */
> + DEBUG_NET_WARN_ON_ONCE(
> + !skb_frag_page(&skb_shinfo(skb)->frags[0]));
It seems skb_frag_page() always return non-NULL in this patch, the above
checking seems unnecessary?
Powered by blists - more mailing lists