linux-kernel - Re: bpf: incorrect passing infinate loop causing rcu detected stall during bpf_prog

Message-ID: <CACkBjsZ5iYQRc6_EREhKA1cg-dFtopSOKQhDo+6SgDnVrz+vcA@mail.gmail.com>
Date:   Mon, 30 Oct 2023 11:29:37 +0100
From:   Hao Sun <sunhao.th@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>,
        Yonghong Song <yonghong.song@...ux.dev>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>,
        bpf <bpf@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: bpf: incorrect passing infinate loop causing rcu detected stall
 during bpf_prog_run()

On Sun, Oct 29, 2023 at 2:35 AM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Fri, Oct 27, 2023 at 2:09 AM Hao Sun <sunhao.th@...il.com> wrote:
> >
> > Hi,
> >
> > The following C repro contains a bpf program that can cause rcu
> > stall/soft lockup during running in bpf_prog_run(). Seems the verifier
> > incorrectly passed the program with an infinite loop.
> >
> > C repro: https://pastebin.com/raw/ymzAxjeU
>
> Thanks for the report.
> Did you debug what exactly caused this bug?
> Are you planning to work on the fix?

This bug is really hard to debug. Here is a simplified view of
the original program:

loop:
0: r4 = r8
1: r1 = 0x1f
2: r8 -= -8
3: if r1 > r7 goto pc+1
4: r7 <<= r1         ; LSH r7 by 31
5: r5 = r0
6: r5 *= 2
7: if r5 < r0 goto pc+1
8: r8 s>>= 6
9: w7 &= w7       ; r7 = 0 after the first iter
10: r8 -= r7
11: r8 -= -1
12: if r4 >= 0x9 goto loop
13: exit

At runtime, r7 becomes 0 via #4 and #9 during the first iteration, so
later iterations always take the jump at #3 and skip #4; #3 can be
ignored after the first iteration. r0 is initialized by get_current_task,
and r5 is always smaller than r0 at runtime, so the jump at #7 is always
taken and #8 never runs. The only updates to r8 are therefore #2 and #11
(#10 subtracts r7, which is 0), and together they add 9 to r8. Since r4
is set to r8 at the start of each iteration, the condition at #12 stays
true and the loop is infinite at runtime.
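
To make the reasoning above easier to check, here is a minimal sketch
that interprets the simplified loop in Python with 64-bit wraparound.
It is not the kernel interpreter or the original repro. The function
name run_loop and the starting values of r0, r7 and r8 are assumptions
chosen for illustration: r0 is a kernel-pointer-like value, r7 is even
and at least 31 so that #4 and #9 zero it, and r8 starts at 9 or above
so that the first check at #12 succeeds.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def to_signed64(x):
    """Reinterpret an unsigned 64-bit value as signed."""
    return x - (1 << 64) if x & (1 << 63) else x


def run_loop(r0, r7, r8, max_iters):
    """Interpret the simplified loop for at most max_iters iterations.

    Returns (exited, trace); trace holds (r4, r7, r8) after each pass.
    """
    trace = []
    for _ in range(max_iters):
        r4 = r8                                   # 0: r4 = r8
        r1 = 0x1F                                 # 1: r1 = 0x1f
        r8 = (r8 + 8) & MASK64                    # 2: r8 -= -8
        if not (r1 > r7):                         # 3: jump skips #4
            r7 = (r7 << r1) & MASK64              # 4: r7 <<= r1
        r5 = r0                                   # 5: r5 = r0
        r5 = (r5 * 2) & MASK64                    # 6: r5 *= 2
        if not (r5 < r0):                         # 7: jump skips #8
            r8 = (to_signed64(r8) >> 6) & MASK64  # 8: r8 s>>= 6
        r7 = r7 & MASK32                          # 9: w7 &= w7 (zero-extends)
        r8 = (r8 - r7) & MASK64                   # 10: r8 -= r7
        r8 = (r8 + 1) & MASK64                    # 11: r8 -= -1
        trace.append((r4, r7, r8))
        if not (r4 >= 0x9):                       # 12: back edge not taken
            return True, trace                    # 13: exit
    return False, trace


if __name__ == "__main__":
    # Assumed starting values, not taken from the original program.
    exited, trace = run_loop(r0=0xFFFF888000000000, r7=0x100, r8=16,
                             max_iters=1000)
    print("exited:", exited)
    print("first passes (r4, r7, r8):", trace[:3])
```

With these assumed inputs, r7 is 0 from the first pass on, r8 grows by
exactly 9 per pass, and the loop does not exit within the iteration cap.
Different starting values may behave differently; the sketch only shows
the runtime behaviour described above, not what the verifier sees.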

Based on the log, the verifier keeps exploring the path from #7 through
#8 to #9, and at some point it prunes states and the path from #7 to #9,
so it stops checking. The log is huge and hard to follow. The issue is
likely in the pruning logic, but I don't know that part well.

>
> > Verifier's log: https://pastebin.com/raw/thZDTFJc
>
> log is trimmed.

Full log: https://pastebin.com/raw/cTC8wmDH