Date: Fri, 14 Jul 2023 18:06:08 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>,
 Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: ast@...nel.org, andrii@...nel.org, martin.lau@...ux.dev,
 razor@...ckwall.org, sdf@...gle.com, john.fastabend@...il.com,
 kuba@...nel.org, dxu@...uu.xyz, joe@...ium.io, toke@...nel.org,
 davem@...emloft.net, bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v4 1/8] bpf: Add generic attach/detach/query API
 for multi-progs

On 7/11/23 8:51 PM, Andrii Nakryiko wrote:
> On Mon, Jul 10, 2023 at 5:23 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
>>
>> On Mon, Jul 10, 2023 at 10:12:11PM +0200, Daniel Borkmann wrote:
>>> + *
>>> + *   struct bpf_mprog_entry *entry, *peer;
>>> + *   int ret;
>>> + *
>>> + *   // bpf_mprog user-side lock
>>> + *   // fetch active @entry from attach location
>>> + *   [...]
>>> + *   ret = bpf_mprog_attach(entry, [...]);
>>> + *   if (ret >= 0) {
>>> + *       peer = bpf_mprog_peer(entry);
>>> + *       if (bpf_mprog_swap_entries(ret))
>>> + *           // swap @entry to @peer at attach location
>>> + *       bpf_mprog_commit(entry);
>>> + *       ret = 0;
>>> + *   } else {
>>> + *       // error path, bail out, propagate @ret
>>> + *   }
>>> + *   // bpf_mprog user-side unlock
>>> + *
>>> + *  Detach case:
>>> + *
>>> + *   struct bpf_mprog_entry *entry, *peer;
>>> + *   bool release;
>>> + *   int ret;
>>> + *
>>> + *   // bpf_mprog user-side lock
>>> + *   // fetch active @entry from attach location
>>> + *   [...]
>>> + *   ret = bpf_mprog_detach(entry, [...]);
>>> + *   if (ret >= 0) {
>>> + *       release = ret == BPF_MPROG_FREE;
>>> + *       peer = release ? NULL : bpf_mprog_peer(entry);
>>> + *       if (bpf_mprog_swap_entries(ret))
>>> + *           // swap @entry to @peer at attach location
>>> + *       bpf_mprog_commit(entry);
>>> + *       if (release)
>>> + *           // free bpf_mprog_bundle
>>> + *       ret = 0;
>>> + *   } else {
>>> + *       // error path, bail out, propagate @ret
>>> + *   }
>>> + *   // bpf_mprog user-side unlock
>>
>> Thanks for the doc. It helped a lot.
>> And when it's contained like this it's easier to discuss api.
>> It seems bpf_mprog_swap_entries() is trying to abstract the error code
>> away, but BPF_MPROG_FREE leaks out and tcx_entry_needs_release()
>> captures it with extra miniq_active twist, which I don't understand yet.
>> bpf_mprog_peer() is also leaking a bit of implementation detail.
>> Can we abstract it further, like:
>>
>> ret = bpf_mprog_detach(entry, [...], &new_entry);
>> if (ret >= 0) {
>>     if (entry != new_entry)
>>       // swap @entry to @new_entry at attach location
>>     bpf_mprog_commit(entry);
>>     if (!new_entry)
>>       // free bpf_mprog_bundle
>> }
>> and make bpf_mprog_peer internal to mprog. It will also allow removing
>> BPF_MPROG_FREE vs SWAP distinction. peer is hidden.
>>     if (entry != new_entry)
>>        // update
>> also will be easier to read inside tcx code without looking into mprog details.

+1, agree, and I implemented this suggestion in the v5.

> I'm actually thinking if it's possible to simplify it even further.
> For example, do we even need a separate bpf_mprog_{attach,detach} and
> bpf_mprog_commit()? So far it seems like bpf_mprog_commit() is
> inevitable in case of success of attach/detach, so we might as well
> just do it as the last step of attach/detach operation.

It needs to be done after the pointers have been swapped by the mprog user.
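Roughly (sketch only, reusing the naming from the doc comment above):

   ret = bpf_mprog_attach(entry, [...]);
   if (ret >= 0) {
       // 1) swap @entry to the new entry at the attach location
       [...]
       // 2) only then:
       bpf_mprog_commit(entry);
   }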

> The only problem seems to be due to bpf_mprog interface doing this
> optimization of replacing stuff in place, if possible, and allowing
> the caller to not do the swap. How important is it to avoid that swap
> of a bpf_mprog_fp (pointer)? Seems pretty cheap (and relatively rare
> operation), so I wouldn't bother optimizing this.

I would like to keep it, given that e.g. when an application comes up,
fetches its links from bpffs and updates all its programs in place, then
this is exactly the replace situation.
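Roughly this pattern from the user space side (sketch only, the pin path
and bpffs layout are just examples):

   int link_fd, prog_fd, err;

   // fetch the pinned tcx link left behind by the old instance
   link_fd = bpf_obj_get("/sys/fs/bpf/tc/ingress_link");
   // prog_fd: fd of the freshly loaded replacement program
   [...]
   // atomically replace the program behind the link, keeping its
   // position in the mprog list
   err = bpf_link_update(link_fd, prog_fd, NULL);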

> So how about we just say that there is always a swap. Internally in
> bpf_mprog_bundle current entry is determined based on revision&1. We
> can have bpf_mprog_cur_entry() to return a proper pointer after
> commit. Or bpf_mprog_attach() can return proper new entry as output
> parameter, whichever is preferable.
> 
> As for BPF_MPROG_FREE. That seems like an unnecessary complication as
> well. Caller can just check bpf_mprog_total() quickly, and if it
> dropped to zero assume FREE. Unless there is something more subtle
> there?

Agree, some users may want to keep an empty bpf_mprog around while others
may want to free it, so I implemented it this way and removed all the
BPF_MPROG_* return codes.
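So the detach path on the tcx side now roughly looks like this (sketch
only, exact signatures may differ):

   ret = bpf_mprog_detach(entry, [...], &new_entry);
   if (ret >= 0) {
       // swap @entry to @new_entry at attach location
       [...]
       bpf_mprog_commit(entry);
       if (!bpf_mprog_total(new_entry)) {
           // nothing attached anymore: the caller can free the
           // bpf_mprog_bundle or keep the empty entry around
       }
   }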

> With the above, the interface will be much simpler, IMO. You just do
> bpf_mprog_attach/detach, and then swap pointer to new bpf_mprog_entry.
> Then you can check bpf_mprog_total() for zero, and clean up further,
> if necessary.
> 
> We assume the caller has a proper locking, so all the above should be non-racy.
> 
> BTW, combining commit with attach allows us to avoid that relatively
> big bpf_mprog_cp array on the stack as well, because we will be able
> to update bundle->cp_items in-place.
> 
> The only (I believe :) ) big assumption I'm making in all of the above
> is that commit is inevitable and we won't have a situation where we
> start attach, update fp/cpp, and then decide to abort instead of going
> for commit. Is this possible? Can we avoid it by careful checks
> upfront and doing attach as last step that cannot be undone?
> 
> P.S. I guess one bit that I might have simplified is that
> synchronize_rcu() + bpf_prog_put(), but I'm not sure exactly why we
> put prog after sync_rcu. But if it's really necessary (and I assume it

It is because users can still be in flight on the old mprog_entry, so the
bpf_prog_put() must come after the synchronize_rcu() where we drop the
reference for the delete case.
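So the ordering on the delete path is (sketch):

   // swap @entry to @new_entry at the attach location, so no new
   // readers can see the old entry anymore
   [...]
   // wait for readers still traversing the old entry
   synchronize_rcu();
   // only now it is safe to drop the reference on the removed program
   bpf_prog_put(prog);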

Thanks again,
Daniel
