lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 5 Oct 2022 20:21:20 +0200 From: Daniel Borkmann <daniel@...earbox.net> To: sdf@...gle.com Cc: bpf@...r.kernel.org, razor@...ckwall.org, ast@...nel.org, andrii@...nel.org, martin.lau@...ux.dev, john.fastabend@...il.com, joannelkoong@...il.com, memxor@...il.com, toke@...hat.com, joe@...ium.io, netdev@...r.kernel.org Subject: Re: [PATCH bpf-next 01/10] bpf: Add initial fd-based API to attach tc BPF programs On 10/5/22 7:56 PM, sdf@...gle.com wrote: > On 10/05, Daniel Borkmann wrote: >> On 10/5/22 2:55 AM, sdf@...gle.com wrote: >> > On 10/05, Daniel Borkmann wrote: >> [...] >> > >> > The series looks exciting, haven't had a chance to look deeply, will try >> > to find some time this week. > >> Great, thanks! > >> > We've chatted briefly about priority during the talk, let's maybe discuss >> > it here more? >> > >> > I, as a user, still really have no clue about what priority to use. >> > We have this problem at tc, and we'll seemingly have the same problem >> > here? I guess it's even more relevant in k8s because internally at G we >> > can control the users. >> > >> > Is it worth at least trying to provide some default bands / guidance? >> > >> > For example, having SEC('tc/ingress') receive attach_priority=124 by >> > default? Maybe we can even have something like 'tc/ingress_first' get >> > attach_priority=1 and 'tc/ingress_last' with attach_priority=254? >> > (the names are arbitrary, we can do something better) >> > >> > ingress_first/ingress_last can be used by some monitoring jobs. The rest >> > can use default 124. If somebody really needs a custom priority, then they >> > can manually use something around 124/2 if they need to trigger before the >> > 'default' priority or 124+124/2 if they want to trigger after? >> > >> > Thoughts? Is it worth it? Do we care? > >> I think guidance is needed, yes, I can add a few paragraphs to the libbpf >> header file where we also have the tc BPF link API. I had a brief discussion >> around this also with developers from datadog as they also use the infra >> via tc BPF. Overall, its a hard problem, and I don't think there's a good >> generic solution. The '*_last' is implied by prio=0, so that kernel auto- >> allocates it, and for libbpf we could add an API for it where the user >> does not need to specify prio specifically. The 'appending' is reasonable >> to me given if an application explicitly requests to be added as first >> (and e.g. enforces policy through tc BPF), but some other 3rd party application >> prepends itself as first, it can bypass the former, which would be too easy >> to shoot yourself in the foot. Overall the issue in tc land is that ordering >> matters, skb packet data could be mangled (e.g. IPs NATed), skb fields can >> be mangled, and we can have redirect actions (dev A vs. B); the only way I'd >> see were this is possible if somewhat verifier can annotate the prog when >> it didn't observe any writes to skb, and no redirect was in play. Then you've >> kind of replicated the constraints similar to tracing where the attachment >> can say that ordering doesn't matter if all the progs are in same style. >> Otherwise, explicit corporation is needed as is today with rest of tc (or >> as Toke did in libxdp) with multi-attach. In the specific case I mentioned >> at LPC, it can be made to work given one of the two is only observing traffic >> at the layer, e.g. it could get prepended if there is guarantee that all >> return codes are tc_act_unspec so that there is no bypass and then you'll >> see all traffic or appended to see only traffic which made it past the >> policy. So it all depends on the applications installing programs, but to >> solve it generically is not possible given ordering and conflicting actions. >> So, imho, an _append() API for libbpf can be added along with guidance for >> developers when to use _append() vs explicit prio. > > Agreed, it's a hard problem to solve, especially from the kernel side. > Ideally, as Toke mentions on the side thread, there should be some kind > of system daemon or some other place where the ordering is described. > But let's start with at least some guidance on the current prio. > > Might be also a good idea to narrow down the prio range to 0-65k for > now? Maybe in the future we'll have some special PRIO_MONITORING_BEFORE_ALL > and PRIO_MONITORING_AFTER_ALL that trigger regardless of TC_ACT_UNSPEC? > I agree with Toke that it's another problem with the current action based > chains that's worth solving somehow (compared to, say, cgroup programs). Makes sense, I'll restrict the range so there's headroom for future extensions, the mentioned 0-65k looks very reasonable to me. Thanks, Daniel
Powered by blists - more mailing lists