lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 7 Oct 2022 15:45:21 -0700 From: sdf@...gle.com To: Daniel Borkmann <daniel@...earbox.net> Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>, "Toke Høiland-Jørgensen" <toke@...hat.com>, bpf <bpf@...r.kernel.org>, Nikolay Aleksandrov <razor@...ckwall.org>, Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, John Fastabend <john.fastabend@...il.com>, Joanne Koong <joannelkoong@...il.com>, Kumar Kartikeya Dwivedi <memxor@...il.com>, Joe Stringer <joe@...ium.io>, Network Development <netdev@...r.kernel.org> Subject: Re: [PATCH bpf-next 01/10] bpf: Add initial fd-based API to attach tc BPF programs On 10/07, Daniel Borkmann wrote: > On 10/7/22 8:59 PM, Alexei Starovoitov wrote: > > On Fri, Oct 7, 2022 at 10:20 AM Toke H�iland-J�rgensen > <toke@...hat.com> wrote: > [...] > > > > > I was thinking a little about how this might work; i.e., how can > the > > > > > kernel expose the required knobs to allow a system policy to be > > > > > implemented without program loading having to talk to anything > other > > > > > than the syscall API? > > > > > > > > > How about we only expose prepend/append in the prog attach UAPI, > and > > > > > then have a kernel function that does the sorting like: > > > > > > > > > int bpf_add_new_tcx_prog(struct bpf_prog *progs, size_t > num_progs, struct > > > > > bpf_prog *new_prog, bool append) > > > > > > > > > where the default implementation just appends/prepends to the > array in > > > > > progs depending on the value of 'appen'. > > > > > > > > > And then use the __weak linking trick (or maybe struct_ops with a > member > > > > > for TXC, another for XDP, etc?) to allow BPF to override the > function > > > > > wholesale and implement whatever ordering it wants? I.e., allow > it can > > > > > to just shift around the order of progs in the 'progs' array > whenever a > > > > > program is loaded/unloaded? > > > > > > > > > This way, a userspace daemon can implement any policy it wants by > just > > > > > attaching to that hook, and keeping things like how to express > > > > > dependencies as a userspace concern? > > > > > > > > What if we do the above, but instead of simple global 'attach > first/last', > > > > the default api would be: > > > > > > > > - attach before <target_fd> > > > > - attach after <target_fd> > > > > - attach before target_fd=-1 == first > > > > - attach after target_fd=-1 == last > > > > > > > > ? > > > > > > Hmm, the problem with that is that applications don't generally have > an > > > fd to another application's BPF programs; and obtaining them from an > ID > > > is a privileged operation (CAP_SYS_ADMIN). We could have it be "attach > > > before target *ID*" instead, which could work I guess? But then the > > > problem becomes that it's racy: the ID you're targeting could get > > > detached before you attach, so you'll need to be prepared to check > that > > > and retry; and I'm almost certain that applications won't test for > this, > > > so it'll just lead to hard-to-debug heisenbugs. Or am I being too > > > pessimistic here? > > > > I like Stan's proposal and don't see any issue with FD. > > It's good to gate specific sequencing with cap_sys_admin. > > Also for consistency the FD is better than ID. > > > > I also like systemd analogy with Before=, After=. > > systemd has a ton more ways to specify deps between Units, > > but none of them have absolute numbers (which is what priority is). > > The only bit I'd tweak in Stan's proposal is: > > - attach before <target_fd> > > - attach after <target_fd> > > - attach before target_fd=0 == first > > - attach after target_fd=0 == last > I think the before(), after() could work, but the target_fd I have my > doubts > that it will be practical. Maybe lets walk through a concrete real > example. app_a > and app_b shipped via container_a resp container_b. Both want to install > tc BPF > and we (operator/user) want to say that prog from app_b should only be > inserted > after the one from app_a, never run before; if no prog_a is installed, we > ofc just > run prog_b, but if prog_a is inserted, it must be before prog_b given the > latter > can only run after the former. How would we get to one anothers target > fd? One > could use the 0, but not if more programs sit before/after. This fd/id has to be definitely abstracted by the loader. With the program, we would ship some metadata like 'run_after:prog_a' for prog_b (where prog_a might be literal function name maybe?). However, this also depends on 'run_before:prog_b' in prog_a (in case it happens to be started after prog_b) :-/ So yeah, some central place might still be needed; in this case, Toke's suggestion on overriding this via bpf seems like the most flexible one. Or maybe libbpf can consult some /etc/bpf.init.d/ directory for those? Not sure if it's too much for libbpf or it's better done by the higher levels? I guess we can rely on the program names and then all we really need is some place to say 'prog X happens before Y' and for the loaders to interpret that. > To me it sounds reasonable to have the append mode as default mode/API, > and an advanced option to say 'I want to run as 2nd prog, but if something > is already attached as 2nd prog, shift all the others +1 in the array' > which > would relate to your above point, Stan, of being able to stick into any > place in the chain. Replying to your other email here: I'd still prefer, from the user side, to be able to stick my prog into any place for debugging. But you suggestion to shift others for +1 works for me. (although, not sure, for example, what happens if I want to shift right the program that's at position 65k; aka already last?) IMO, having explicit before/after+target is slightly better usability-wise than juggling priorities, but I'm fine with either way.
Powered by blists - more mailing lists