netdev - Re: [PATCH v5 bpf-next 3/4] selftest/bpf: Implement sample UNIX domain socket iterator program.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c5b4e17b-97b3-061a-6956-6f21c5ad9581@fb.com>
Date:   Sun, 15 Aug 2021 11:10:49 -0700
From:   Yonghong Song <yhs@...com>
To:     Kuniyuki Iwashima <kuniyu@...zon.co.jp>,
        <andrii.nakryiko@...il.com>
CC:     <andrii@...nel.org>, <ast@...nel.org>, <benh@...zon.com>,
        <bpf@...r.kernel.org>, <daniel@...earbox.net>,
        <davem@...emloft.net>, <john.fastabend@...il.com>, <kafai@...com>,
        <kpsingh@...nel.org>, <kuba@...nel.org>, <kuni1840@...il.com>,
        <netdev@...r.kernel.org>, <songliubraving@...com>
Subject: Re: [PATCH v5 bpf-next 3/4] selftest/bpf: Implement sample UNIX
 domain socket iterator program.



On 8/13/21 5:21 PM, Kuniyuki Iwashima wrote:
> From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
> Date:   Fri, 13 Aug 2021 16:25:53 -0700
>> On Thu, Aug 12, 2021 at 9:46 AM Kuniyuki Iwashima <kuniyu@...zon.co.jp> wrote:
>>>
>>> The iterator can output almost the same result compared to /proc/net/unix.
>>> The header line is aligned, and the Inode column uses "%8lu" because "%5lu"
>>> can be easily overflown.
>>>
>>>    # cat /sys/fs/bpf/unix
>>>    Num               RefCount Protocol Flags    Type St Inode    Path
>>
>> It's totally my OCD, but why the column name is not aligned with
>> values? I mean the "Inode" column. It's left aligned, but values
>> (numbers) are right-aligned? I'd fix that while applying, but I can't
>> apply due to selftests failures, so please take a look.
> 
> Ah, honestly, I've felt something strange about the column... will fix it!
> 
> 
>>
>>
>>>    ffff963c06689800: 00000002 00000000 00010000 0001 01    18697 private/defer
>>>    ffff963c7c979c00: 00000002 00000000 00000000 0001 01   598245 @Hello@...ld@
>>>
>>>    # cat /proc/net/unix
>>>    Num       RefCount Protocol Flags    Type St Inode Path
>>>    ffff963c06689800: 00000002 00000000 00010000 0001 01 18697 private/defer
>>>    ffff963c7c979c00: 00000002 00000000 00000000 0001 01 598245 @Hello@...ld@
>>>
>>> Note that this prog requires the patch ([0]) for LLVM code gen.  Thanks to
>>> Yonghong Song for analysing and fixing.
>>>
>>> [0] https://reviews.llvm.org/D107483
>>>
>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
>>> Acked-by: Yonghong Song <yhs@...com>
>>> ---
>>
>> This selftests breaks test_progs-no_alu32 ([0], the error log is super
>> long and can freeze browser; it looks like an infinite loop and BPF
>> verifier just keeps reporting it until it runs out of 1mln
>> instructions or something). Please check what's going on there, I
>> can't land it as it is right now.
>>
>>    [0] https://github.com/kernel-patches/bpf/runs/3326071112?check_suite_focus=true#step:6:124288
>>
>>
>>>   tools/testing/selftests/bpf/README.rst        | 38 +++++++++
>>>   .../selftests/bpf/prog_tests/bpf_iter.c       | 16 ++++
>>>   tools/testing/selftests/bpf/progs/bpf_iter.h  |  8 ++
>>>   .../selftests/bpf/progs/bpf_iter_unix.c       | 77 +++++++++++++++++++
>>>   .../selftests/bpf/progs/bpf_tracing_net.h     |  4 +
>>>   5 files changed, 143 insertions(+)
>>>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_unix.c
>>>
>>
>> [...]
>>
>>> +                       /* The name of the abstract UNIX domain socket starts
>>> +                        * with '\0' and can contain '\0'.  The null bytes
>>> +                        * should be escaped as done in unix_seq_show().
>>> +                        */
>>> +                       int i, len;
>>> +
>>
>> no_alu32 variant probably isn't happy about using int for this, it
>> probably does << 32, >> 32 dance and loses track of actual value in
>> the loop. You can try using u64 instead.
> 
> Sorry, I missed the no_alu32 test.
> Changing int to __u64 fixed the error, thanks!

Indeed for no_alu32, the index has << 32 and >> 32, which makes
verifier *equivalent* register tracking not effective, see below:

       96:       r1 = r8 

       97:       r1 <<= 32 

       98:       r2 = r1 

       99:       r2 >>= 32 

      100:       if r2 > 109 goto +19 <LBB0_21> 

      101:       r1 s>>= 32 

      102:       if r1 s< 2 goto +17 <LBB0_21> 

      103:       r9 = 1 

      104:       r8 <<= 32 

      105:       r8 >>= 32

Because these shifting, r1/r2/r8 equivalence cannot be
easily established, so verifier ends with conservative
r8 and cannot verify program successfully.

Using __u64 for 'i' and 'len', the upper bound is directly
tested:
       98:       if r8 > 109 goto +16 <LBB0_21> 

       99:       if r8 < 2 goto +15 <LBB0_21>
and verifier is very happy with this.

> 
> 
>>
>>> +                       len = unix_sk->addr->len - sizeof(short);
>>> +
>>> +                       BPF_SEQ_PRINTF(seq, " @");
>>> +
>>> +                       /* unix_mkname() tests this upper bound. */
>>> +                       if (len < sizeof(struct sockaddr_un))
>>> +                               for (i = 1; i < len; i++)
>>
>> if you move above if inside the loop to break out of the loop, does it
>> change how Clang generates code?
>>
>> for (i = 1; i < len i++) {
>>      if (i >= sizeof(struct sockaddr_un))
>>          break;
>>      BPF_SEQ_PRINTF(...);
>> }
> 
> Yes, but there seems little defference.
> Which is preferable?
> 
> ---8<---
> before (for inside if) <- -> after (if inside loop)
>        96:	07 08 00 00 fe ff ff ff	r8 += -2			  |	; 			for (i = 1; i < len; i++) {
> ; 			if (len < sizeof(struct sockaddr_un))		  |	      97:	bf 81 00 00 00 00 00 00	r1 = r8
>        97:	25 08 10 00 6d 00 00 00	if r8 > 109 goto +16 <LBB0_21>	  |	      98:	07 01 00 00 fc ff ff ff	r1 += -4
> ; 				for (i = 1; i < len; i++)		  |	      99:	25 01 12 00 6b 00 00 00	if r1 > 107 goto +18 <LBB0_21>
>        98:	a5 08 0f 00 02 00 00 00	if r8 < 2 goto +15 <LBB0_21>	  |	     100:	07 08 00 00 fe ff ff ff	r8 += -2
>        99:	b7 09 00 00 01 00 00 00	r9 = 1				  |	     101:	b7 09 00 00 01 00 00 00	r9 = 1
>       100:	05 00 16 00 00 00 00 00	goto +22 <LBB0_18>		  |	     102:	b7 06 00 00 02 00 00 00	r6 = 2
> 									  |	     103:	05 00 17 00 00 00 00 00	goto +23 <LBB0_17>
> ...
>       111:	85 00 00 00 7e 00 00 00	call 126			  |	     113:	b4 05 00 00 08 00 00 00	w5 = 8
> ; 				for (i = 1; i < len; i++)		  |	     114:	85 00 00 00 7e 00 00 00	call 126
>       112:	07 09 00 00 01 00 00 00	r9 += 1				  |	; 			for (i = 1; i < len; i++) {
>       113:	ad 89 09 00 00 00 00 00	if r9 < r8 goto +9 <LBB0_18>	  |	     115:	25 08 02 00 6d 00 00 00	if r8 > 109 goto +2 <LBB0_21>
> 									  >	     116:	07 09 00 00 01 00 00 00	r9 += 1
> 									  >	; 			for (i = 1; i < len; i++) {
> 									  >	     117:	ad 89 09 00 00 00 00 00	if r9 < r8 goto +9 <LBB0_17>
> ---8<---
> 
> 
>>
>>
>>> +                                       BPF_SEQ_PRINTF(seq, "%c",
>>> +                                                      unix_sk->addr->name->sun_path[i] ?:
>>> +                                                      '@');
>>> +               }
>>> +       }
>>> +
>>> +       BPF_SEQ_PRINTF(seq, "\n");
>>> +
>>> +       return 0;
>>> +}
>>> diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
>>> index 3af0998a0623..eef5646ddb19 100644
>>> --- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
>>> +++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
>>> @@ -5,6 +5,10 @@
>>>   #define AF_INET                        2
>>>   #define AF_INET6               10
>>>
>>> +#define __SO_ACCEPTCON         (1 << 16)
>>> +#define UNIX_HASH_SIZE         256
>>> +#define UNIX_ABSTRACT(unix_sk) (unix_sk->addr->hash < UNIX_HASH_SIZE)
>>> +
>>>   #define SOL_TCP                        6
>>>   #define TCP_CONGESTION         13
>>>   #define TCP_CA_NAME_MAX                16
>>> --
>>> 2.30.2
>>>