linux-kernel - Re: [PATCH net-next 2/3] net: use skb_for_each

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210409142808.11b479ea@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Fri, 9 Apr 2021 14:28:08 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Matteo Croce <mcroce@...ux.microsoft.com>
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Julia Lawall <julia.lawall@...ia.fr>
Subject: Re: [PATCH net-next 2/3] net: use skb_for_each_frag() helper where
 possible

On Fri, 9 Apr 2021 22:44:50 +0200 Matteo Croce wrote:
> > What pops to mind (although quite nit picky) is the question if the
> > assembly changes much between driver which used to cache nr_frags and
> > now always going skb_shinfo(skb)->nr_frags? It's a relatively common
> > pattern.  
> 
> Since skb_shinfo() is a macro and skb_end_pointer() a static inline,
> it should be the same, but I was curious to check so, this is a diff
> between the following snippet before and afer the macro:
> 
> int frags = skb_shinfo(skb)->nr_frags;
> int i;
> for (i = 0; i < frags; i++)
>     kfree(skb->frags[i]);
> 
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> --- ins1.s 2021-04-09 22:35:59.384523865 +0200
> +++ ins2.s 2021-04-09 22:36:08.132594737 +0200
> @@ -1,26 +1,27 @@
>  iter:
>          movsx   rax, DWORD PTR [rdi+16]
>          mov     rdx, QWORD PTR [rdi+8]
>          mov     eax, DWORD PTR [rdx+rax]
>          test    eax, eax
>          jle     .L6
>          push    rbp
> -        sub     eax, 1
> +        mov     rbp, rdi
>          push    rbx
> -        lea     rbp, [rdi+32+rax*8]
> -        lea     rbx, [rdi+24]
> +        xor     ebx, ebx
>          sub     rsp, 8
>  .L3:
> -        mov     rdi, QWORD PTR [rbx]
> -        add     rbx, 8
> +        mov     rdi, QWORD PTR [rbp+24+rbx*8]
> +        add     rbx, 1
>          call    kfree
> -        cmp     rbx, rbp
> -        jne     .L3
> +        movsx   rax, DWORD PTR [rbp+16]
> +        mov     rdx, QWORD PTR [rbp+8]
> +        cmp     DWORD PTR [rdx+rax], ebx
> +        jg      .L3
>          add     rsp, 8
>          xor     eax, eax
>          pop     rbx
>          pop     rbp
>          ret
>  .L6:
>          xor     eax, eax
>      for (i = 0; i < frags; i++)    ret
> 

So looks like before compiler generated:

	end = &frags[nfrags]
	for (ptr = &frag[0]; ptr < end; ptr++)

and now it has to use the actual value of i, read nfrags in the loop
each time and compare it to i.

That makes sense, since it can't prove kfree() doesn't change nr_frags.

IDK if we care, but at least commit message should mention this.