Message-ID:
<AM6PR03MB5080F92AACF93F40872F53AB99FE2@AM6PR03MB5080.eurprd03.prod.outlook.com>
Date: Fri, 14 Feb 2025 20:53:43 +0000
From: Juntong Deng <juntong.deng@...look.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
<martin.lau@...ux.dev>, Eddy Z <eddyz87@...il.com>,
Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>,
Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
Kumar Kartikeya Dwivedi <memxor@...il.com>, snorcht@...il.com,
Christian Brauner <brauner@...nel.org>, bpf <bpf@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] bpf: Rethinking BPF safety, BPF open-coded iterators, and
possible improvements (runtime protection)
On 2025/2/8 02:40, Alexei Starovoitov wrote:
> On Tue, Feb 4, 2025 at 4:40 PM Juntong Deng <juntong.deng@...look.com> wrote:
>>
>> On 2025/2/4 23:59, Alexei Starovoitov wrote:
>>> On Tue, Feb 4, 2025 at 11:35 PM Juntong Deng <juntong.deng@...look.com> wrote:
>>>>
>>>> This discussion comes from the patch series open-coded BPF file
>>>> iterator, which was Nack-ed and thus ended [0].
>>>>
>>>> Thanks for the feedback from Christian, Linus, and Al, all very helpful.
>>>>
>>>> The problems encountered in this patch series may also be encountered in
>>>> other BPF open-coded iterators to be added in the future, or in other
>>>> BPF usage scenarios.
>>>>
>>>> So maybe this is a good opportunity for us to discuss all of this and
>>>> rethink BPF safety, BPF open coded iterators, and possible improvements.
>>>>
>>>> [0]:
>>>> https://lore.kernel.org/bpf/AM6PR03MB50801990BD93BFA2297A123599EC2@AM6PR03MB5080.eurprd03.prod.outlook.com/T/#t
>>>>
>>>> What do we expect from BPF safety?
>>>> ----------------------------------
>>>>
>>>> Christian points out the important fact that BPF programs can hold
>>>> references for a long time and cause weird issues.
>>>>
>>>> This is an inherent flaw in BPF. Since the addition of bpf_loop and
>>>> BPF open-coded iterators, the myth that BPF is "absolutely" safe has
>>>> been broken.
>>>>
>>>> The BPF verifier is a static verifier and has no way of knowing how
>>>> long a BPF program will actually run.
>>>>
>>>> For example, the following BPF program can freeze your computer, but
>>>> can pass the BPF verifier smoothly.
>>>>
>>>> SEC("raw_tp/sched_switch")
>>>> int BPF_PROG(on_switch)
>>>> {
>>>> 	struct bpf_iter_num it;
>>>> 	int *v;
>>>> 	bpf_iter_num_new(&it, 0, 100000);
>>>> 	while ((v = bpf_iter_num_next(&it))) {
>>>> 		struct bpf_iter_num it2;
>>>> 		bpf_iter_num_new(&it2, 0, 100000);
>>>> 		while ((v = bpf_iter_num_next(&it2))) {
>>>> 			bpf_printk("BPF Bomb\n");
>>>> 		}
>>>> 		bpf_iter_num_destroy(&it2);
>>>> 	}
>>>> 	bpf_iter_num_destroy(&it);
>>>> 	return 0;
>>>> }
>>>>
>>>> This BPF program runs a huge loop at each schedule.
>>>>
>>>> bpf_iter_num_new is a common iterator that we can use in almost any
>>>> context, including LSM, sched-ext, tracing, etc.
>>>>
>>>> We can run large, long loops on any critical code path and freeze the
>>>> system, since the BPF verifier has no way of knowing how long the
>>>> iteration will run.
>>>
>>> This is completely orthogonal to the issue that Christian explained.
>>
>> Thanks for your reply!
>>
>> Completely orthogonal? Sorry, I may have some misunderstandings.
>
> ...
>
>> program runs a huge loop at each schedule
>
> You've discovered bpf iterators and said, rephrasing,
> "loops can take a long time" and concluded with:
> "This is an inherent flaw in BPF".
>
> This kind of rhetoric is not helpful.
> People that wanted to abuse bpf powers could have done it 10 years
> ago without iterators, loops, etc.
> One could create a hash map and populate it with collisions
> and long per-bucket link lists. Though we have a random seed, with
> enough persistence the hashtab becomes slow.
> Then just do bpf_map_lookup_elem() from the prog.
> This was a known issue that is gradually being fixed.
>
Sorry for my inappropriate wording.
Actually, I just wanted to give an example to show that the problem has
existed for a long time and exists in other iterators as well.
Sorry for saying "inherent flaw in BPF"; I should try to help fix it
instead.
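For concreteness, a rough sketch of the kind of program described
above (illustrative only; the map and program names are made up by me,
and the long collision chain would have to be built from user space
despite the random seed):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1 << 20);
	__type(key, __u64);
	__type(value, __u64);
} big_hash SEC(".maps");

SEC("raw_tp/sched_switch")
int BPF_PROG(lookup_on_switch)
{
	__u64 key = 0;
	__u64 *val;

	/* If user space managed to put many keys into one bucket, this
	 * lookup walks a long collision chain on every context switch. */
	val = bpf_map_lookup_elem(&big_hash, &key);
	if (val)
		bpf_printk("val %llu\n", *val);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";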
>> Could you please share a link to the patch? I am curious how we can
>> fix this.
>
> There is no "fix" for the iterator. There is no single patch either.
> The issues were discussed over many _years_ in LPC and LSFMM.
> Exception logic was a step to fixing it.
> Now we will do "exceptions part 2" or will rip out exceptions completely
> and go with "fast execute" approach.
> When either approach works we can add a watchdog (and other mechanisms)
> to cancel program execution.
> Unlike user space there is no easy way to sigkill bpf prog.
> We have to free up all resources cleanly.
>
I sent a proof-of-concept patch series [0] that implements low-overhead,
non-intrusive runtime acquire/release tracking.
The BPF runtime hooks are implemented by replacing the call address of
the CALL instruction during JIT.
I hope this patch series will help with the watchdog and resource
auto-release issues.
[0]:
https://lore.kernel.org/bpf/AM6PR03MB5080513BFAEB54A93CC70D4399FE2@AM6PR03MB5080.eurprd03.prod.outlook.com/T/#u
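To make the idea a little more concrete, here is a heavily simplified
sketch of the general mechanism (this is not code from the series; the
names, the fixed-size table, and the way the real kfunc is handed to
the hooks are all simplifications I made up for illustration):

/* The JIT rewrites "call acquire_kfunc" / "call release_kfunc" into
 * calls to thin wrappers, so the runtime can observe every acquire and
 * release without changing the program or the verifier. */

#define MAX_TRACKED_REFS 64

typedef void *(*acquire_fn_t)(void *arg);
typedef void (*release_fn_t)(void *obj);

struct ref_table {
	void *objs[MAX_TRACKED_REFS];	/* references currently held */
	int nr;
};

/* In a real implementation this would be per program invocation. */
static struct ref_table refs;

static void *acquire_hook(acquire_fn_t real_kfunc, void *arg)
{
	void *obj = real_kfunc(arg);

	/* Record the reference the program just acquired. */
	if (obj && refs.nr < MAX_TRACKED_REFS)
		refs.objs[refs.nr++] = obj;
	return obj;
}

static void release_hook(release_fn_t real_kfunc, void *obj)
{
	int i;

	/* Forget the reference before handing it back to the kernel. */
	for (i = 0; i < refs.nr; i++) {
		if (refs.objs[i] == obj) {
			refs.objs[i] = refs.objs[--refs.nr];
			break;
		}
	}
	real_kfunc(obj);
}

/* If a watchdog has to cancel the program, everything still recorded
 * here can be released instead of being leaked. */
static void cancel_cleanup(release_fn_t release)
{
	while (refs.nr > 0)
		release(refs.objs[--refs.nr]);
}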
>> Yes, I am willing to help, so I included a "Possible improvements"
>> section.
>
> With rants like "inherent flaw in BPF" it's hard to take
> your offer of help seriously.
>
>> I am also working on another patch about filters that we discussed
>> earlier, although it still needs some time.
>
> Pls focus on landing that first.