Date:   Fri, 16 Jun 2023 18:31:29 +0200
From:   Jesper Dangaard Brouer <jbrouer@...hat.com>
To:     Yunsheng Lin <linyunsheng@...wei.com>,
        Jesper Dangaard Brouer <jbrouer@...hat.com>,
        Alexander Duyck <alexander.duyck@...il.com>
Cc:     brouer@...hat.com, davem@...emloft.net, kuba@...nel.org,
        pabeni@...hat.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Lorenzo Bianconi <lorenzo@...nel.org>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Maryam Tahhan <mtahhan@...hat.com>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH net-next v3 3/4] page_pool: introduce page_pool_alloc()
 API



On 16/06/2023 13.57, Yunsheng Lin wrote:
> On 2023/6/16 0:19, Jesper Dangaard Brouer wrote:
> 
> ...
> 
>> You have mentioned veth as the use-case. I know I acked adding a page_pool
>> use-case to veth, for when we need to convert an SKB into an
>> xdp_buff/xdp-frame, but maybe it was the wrong hammer(?).
>> In this veth case, the size is known at page allocation time.
>> Thus, using the page_pool API is wasting memory.  We did this for
>> performance reasons, but we are not using PP for what it was intended
>> for.  We mostly use page_pool because it is an existing recycle return
>> path, and we were too lazy to add another alloc-type (see enum
>> xdp_mem_type).
>>
>> Maybe you/we can extend veth to use this dynamic-size API, to show us
>> that this API is a better approach.  I will sign up for benchmarking
>> this (and for coordinating with Maryam, CC'ed, as she came with the
>> use-case we improved on).
> 
> Thanks, let's find out if page pool is the right hammer for the
> veth XDP case.
> 
> Below is the change for veth using the new API in this patch.
> It is only compile-tested, as I am not familiar enough with veth XDP
> and the testing environment for it.
> Please give it a try if it is helpful.
> 
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 614f3e3efab0..8850394f1d29 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -736,7 +736,7 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>          if (skb_shared(skb) || skb_head_is_locked(skb) ||
>              skb_shinfo(skb)->nr_frags ||
>              skb_headroom(skb) < XDP_PACKET_HEADROOM) {
> -               u32 size, len, max_head_size, off;
> +               u32 size, len, max_head_size, off, truesize, page_offset;
>                  struct sk_buff *nskb;
>                  struct page *page;
>                  int i, head_off;
> @@ -752,12 +752,15 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>                  if (skb->len > PAGE_SIZE * MAX_SKB_FRAGS + max_head_size)
>                          goto drop;
> 
> +               size = min_t(u32, skb->len, max_head_size);
> +               truesize = size;
> +
>                  /* Allocate skb head */
> -               page = page_pool_dev_alloc_pages(rq->page_pool);
> +               page = page_pool_dev_alloc(rq->page_pool, &page_offset, &truesize);

Maybe rename the API to:

  addr = netmem_alloc(rq->page_pool, &truesize);
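
(Not part of this series; just a minimal sketch of what I have in mind,
assuming such a helper would simply wrap the page_pool_dev_alloc()
introduced in this patch and hand back the virtual address instead of
the page + offset pair.  The name and signature are only my suggestion,
not an existing API.)

  /* Hypothetical wrapper on top of the page_pool_dev_alloc() from this
   * patch: callers that only want a buffer pointer never see the
   * struct page.
   */
  static inline void *netmem_alloc(struct page_pool *pool,
                                   unsigned int *truesize)
  {
          unsigned int offset;
          struct page *page;

          page = page_pool_dev_alloc(pool, &offset, truesize);
          if (!page)
                  return NULL;

          return page_address(page) + offset;
  }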

>                  if (!page)
>                          goto drop;
> 
> -               nskb = napi_build_skb(page_address(page), PAGE_SIZE);
> +               nskb = napi_build_skb(page_address(page) + page_offset, truesize);

IMHO this illustrates that the API is strange/funky.
(I think this is what Alex Duyck is also pointing out).

This is the memory (virtual) address "pointer":
  addr = page_address(page) + page_offset

This is what napi_build_skb() takes as input. (I looked at other users
of napi_build_skb(), who all pass a mem ptr "va" as the arg.)
So, why does your new API provide the "page" and not just the address?

As proposed above:
   addr = netmem_alloc(rq->page_pool, &truesize);

Maybe the API should be renamed, to indicate this isn't returning a "page"?
We have talked about the name "netmem" before.
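
With an address-returning helper like that, the head allocation above
could read something like this (again only a sketch, reusing the
hypothetical netmem_alloc() from before and a new void *va local; note
the error path then has to recover the page via virt_to_head_page() to
free it):

  size = min_t(u32, skb->len, max_head_size);
  truesize = size;

  /* Hypothetical: address in, address out; no struct page at the call site. */
  va = netmem_alloc(rq->page_pool, &truesize);
  if (!va)
          goto drop;

  nskb = napi_build_skb(va, truesize);
  if (!nskb) {
          /* Freeing still needs the page, recovered from the address. */
          page_pool_put_full_page(rq->page_pool, virt_to_head_page(va), true);
          goto drop;
  }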

>                  if (!nskb) {
>                          page_pool_put_full_page(rq->page_pool, page, true);
>                          goto drop;
> @@ -767,7 +770,6 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>                  skb_copy_header(nskb, skb);
>                  skb_mark_for_recycle(nskb);
> 
> -               size = min_t(u32, skb->len, max_head_size);
>                  if (skb_copy_bits(skb, 0, nskb->data, size)) {
>                          consume_skb(nskb);
>                          goto drop;
> @@ -782,14 +784,17 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
>                  len = skb->len - off;
> 
>                  for (i = 0; i < MAX_SKB_FRAGS && off < skb->len; i++) {
> -                       page = page_pool_dev_alloc_pages(rq->page_pool);
> +                       size = min_t(u32, len, PAGE_SIZE);
> +                       truesize = size;
> +
> +                       page = page_pool_dev_alloc(rq->page_pool, &page_offset,
> +                                                  &truesize);
>                          if (!page) {
>                                  consume_skb(nskb);
>                                  goto drop;
>                          }
> 
> -                       size = min_t(u32, len, PAGE_SIZE);
> -                       skb_add_rx_frag(nskb, i, page, 0, size, PAGE_SIZE);
> +                       skb_add_rx_frag(nskb, i, page, page_offset, size, truesize);

I guess this shows the opposite: that the "page" _is_ used by the
existing API (skb_add_rx_frag() takes the page, not an address).
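
(For contrast, a sketch of the frag loop with an address-only
netmem_alloc(): the page and offset have to be reconstructed before
skb_add_rx_frag(), so for this path the page-returning form arguably
fits better.  Purely illustrative, same hypothetical helper as above.)

  size = min_t(u32, len, PAGE_SIZE);
  truesize = size;

  va = netmem_alloc(rq->page_pool, &truesize);
  if (!va) {
          consume_skb(nskb);
          goto drop;
  }

  /* Recover the page/offset pair that skb_add_rx_frag() wants. */
  page = virt_to_head_page(va);
  skb_add_rx_frag(nskb, i, page, va - page_address(page), size, truesize);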

>                          if (skb_copy_bits(skb, off, page_address(page),
>                                            size)) {
>                                  consume_skb(nskb);

--Jesper
