Message-ID: <a10b6d19-232c-4b6d-bd71-eb3451675f64@gmail.com>
Date: Thu, 29 Feb 2024 14:22:16 +0100
From: Richard Gobert <richardbgobert@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
 dsahern@...nel.org, shuah@...nel.org, liujian56@...wei.com,
 horms@...nel.org, aleksander.lobakin@...el.com, linyunsheng@...wei.com,
 therbert@...gle.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next 1/3] net: gro: set {inner_,}network_header in
 receive phase



Eric Dumazet wrote:
> 
> My intuition is that this patch has a high cost for normal GRO processing.
> SW-GRO is already a bottleneck on ARM cores in smart NICS.
> 
> I would suggest instead using parameters to give both the nhoff and thoff values
> this would avoid many conditionals in the fast path.
> 
> ->
> 
> INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int
> nhoff, int thoff)
> {
>  const struct ipv6hdr *ipv6h = (const struct ipv6hdr *)(skb->data + nhoff);
>  struct udphdr *uh = (struct udphdr *)(skb->data + thoff);
> ...
> }
> 
> INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int
> nhoff, int thoff)
> {
>        const struct ipv6hdr *iph =  (const struct ipv6hdr *)(skb->data + nhoff);
>        struct tcphdr *th = (struct tcphdr *)(skb->data + thoff);
> 
> Why storing in skb fields things that really could be propagated more
> efficiently as function parameters ?

Hi Eric,
Thanks for the review!
 
I agree, the conditionals could be a problem and are actually not needed.
The third commit in this patch series introduces an optimisation for
ipv6/ipv4 using the correct {inner_}network_header. We can remove the
conditionals; I thought about multiple ways to do so. First, remove the
conditional in skb_gro_network_offset:
 
    static inline int skb_gro_network_offset(const struct sk_buff *skb)
    {
        const u32 mask = NAPI_GRO_CB(skb)->encap_mark - 1;
        return (skb_network_offset(skb) & mask) | (skb_inner_network_offset(skb) & ~mask);
    }
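
As a rough illustration of why the mask works, here is a minimal userspace sketch. It assumes encap_mark is only ever 0 or 1; struct demo_skb and demo_gro_network_offset() are hypothetical stand-ins, not kernel code. With encap_mark == 0 the mask is all ones and the outer offset is returned; with encap_mark == 1 the mask is zero and the inner offset is returned.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant skb state; not the kernel's sk_buff. */
struct demo_skb {
    uint32_t encap_mark;        /* assumed to be 0 (plain) or 1 (encapsulated) */
    int network_offset;         /* offset of the outer network header */
    int inner_network_offset;   /* offset of the inner network header */
};

/* Branchless select between the two offsets, mirroring the mask idea above. */
static inline int demo_gro_network_offset(const struct demo_skb *skb)
{
    /* encap_mark == 0 -> mask = 0xffffffff; encap_mark == 1 -> mask = 0 */
    const uint32_t mask = skb->encap_mark - 1;

    return (int)(((uint32_t)skb->network_offset & mask) |
                 ((uint32_t)skb->inner_network_offset & ~mask));
}
```

For example, with network_offset = 14 and inner_network_offset = 34, the helper yields 14 when encap_mark is 0 and 34 when it is 1, with no conditional in either case.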
 
For the conditionals in {inet,ipv6}_gro_receive, I thought about two
ideas. The first is to move the skb_set_inner_network_header call to the
encapsulation gro functions like ipip_gro_receive. This way there's one
fewer write (in comparison to main) in these functions:

    static struct sk_buff *ipip_gro_receive(struct list_head *head,
                        struct sk_buff *skb)
    {
        ...
 
        NAPI_GRO_CB(skb)->encap_mark = 1;
        skb_set_inner_network_header(skb, skb_gro_offset(skb));
 
The second way is to always write to inner_network_header:

    INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
                                struct sk_buff *skb)
    {
        ...
        skb_set_inner_network_header(skb, off);
        ...
 
What do you think is better? I think the 1st is more beneficial for the
fast path.

We could then use the {inner_}network_header separation to optimise the
receive path, such as in the 3rd commit in this patch series.
 
Regards,
Richard
