lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <21cc97c9-1778-1b73-a05e-4ebbca39c861@kernel.org>
Date: Tue, 22 Aug 2023 20:13:17 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Yunsheng Lin <linyunsheng@...wei.com>,
 Jesper Dangaard Brouer <jbrouer@...hat.com>,
 Liang Chen <liangchen.linux@...il.com>, hawk@...nel.org, horms@...nel.org,
 davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com
Cc: ilias.apalodimas@...aro.org, daniel@...earbox.net, ast@...nel.org,
 netdev@...r.kernel.org, Lorenzo Bianconi <lorenzo.bianconi@...hat.com>,
 Stanislav Fomichev <sdf@...gle.com>, Maryam Tahhan <mtahhan@...hat.com>
Subject: Re: [RFC PATCH net-next v3 0/2] net: veth: Optimizing page pool usage



On 22/08/2023 15.05, Jesper Dangaard Brouer wrote:
> 
> On 22/08/2023 14.24, Yunsheng Lin wrote:
>> On 2023/8/22 5:54, Jesper Dangaard Brouer wrote:
>>> On 21/08/2023 16.21, Jesper Dangaard Brouer wrote:
>>>>
>>>> On 16/08/2023 14.30, Liang Chen wrote:
>>>>> Page pool is supported for veth, but at the moment pages are not properly
>>>>> recyled for XDP_TX and XDP_REDIRECT. That prevents veth xdp from fully
>>>>> leveraging the advantages of the page pool. So this RFC patchset is mainly
>>>>> to make recycling work for those cases. With that in place, it can be
>>>>> further optimized by utilizing the napi skb cache. Detailed figures are
>>>>> presented in each commit message, and together they demonstrate a quite
>>>>> noticeable improvement.
>>>>>
>>>>
>>>> I'm digging into this code path today.
>>>>
>>>> I'm trying to extend this and find a way to support SKBs that used
>>>> kmalloc (skb->head_frag=0), such that we can remove the
>>>> skb_head_is_locked() check in veth_convert_skb_to_xdp_buff(), which will
>>>> allow more SKBs to avoid realloc.  As long as they have enough headroom,
>>>> which we can dynamically control for netdev TX-packets by adjusting
>>>> netdev->needed_headroom, e.g. when loading an XDP prog.
>>>>
>>>> I noticed netif_receive_generic_xdp() and bpf_prog_run_generic_xdp() can
>>>> handle SKB kmalloc (skb->head_frag=0).  Going though the code, I don't
>>>> think it is a bug that generic-XDP allows this.
>>
>> Is it possible to relaxe other checking too, and implement something like
>> pskb_expand_head() in xdp core if xdp core need to modify the data?
>>
> 
> Yes, I definitely hope (and plan) to relax other checks.
> 
> The XDP_PACKET_HEADROOM (256 bytes) check have IMHO become obsolete and
> wrong, as many drivers today use headroom 192 bytes for XDP (which we
> allowed).  Thus, there is not reason for veth to insist on this
> XDP_PACKET_HEADROOM limit.  Today XDP can handle variable headroom (due
> to these drivers).
> 
> 
>>
>>>>
>>>> Deep into this rabbit hole, I start to question our approach.
>>>>    - Perhaps the veth XDP approach for SKBs is wrong?
>>>>
>>>> The root-cause of this issue is that veth_xdp_rcv_skb() code path (that
>>>> handle SKBs) is calling XDP-native function "xdp_do_redirect()". I
>>>> question, why isn't it using "xdp_do_generic_redirect()"?
>>>> (I will jump into this rabbit hole now...)
>>
>> Is there any reason why xdp_do_redirect() can not handle the 
>> slab-allocated
>> data? Can we change the xdp_do_redirect() to handle slab-allocated
>> data, so that it can benefit other case beside veth too?
>>
> 
> I started coding up this, but realized that it was a wrong approach.
> 
> The xdp_do_redirect() call is for native-XDP with a proper xdp_buff.
> When dealing with SKBs we pretend is a xdp_buff, we have the API
> xdp_do_generic_redirect().  IMHO it is wrong to "steal" the packet-data
> from an SKB and in-order to use the native-XDP API xdp_do_redirect().
> In the use-cases I see, often the next layer will allocate a new SKB and
> attach the stolen packet-data , which is pure-waste as
> xdp_do_generic_redirect() keeps the SKB intact, so no new SKB allocs.
> 

Please see my RFC-v1 patchset[1] for a different approach, which avoids 
the realloc and page_pool usage all together (but also results in 
correct recycling for PP when realloc cannot be avoided).

[1] 
https://lore.kernel.org/all/169272709850.1975370.16698220879817216294.stgit@firesoul/

--Jesper



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ