Message-ID: <CAM_iQpVE4XG7SPAVBmV2UtqUANg3X-1ngY7COYC03NrT6JkZ+g@mail.gmail.com>
Date: Tue, 27 Apr 2021 09:36:01 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>,
Xiongchun Duan <duanxiongchun@...edance.com>,
Dongdong Wang <wangdongdong.6@...edance.com>,
Muchun Song <songmuchun@...edance.com>,
Cong Wang <cong.wang@...edance.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
Pedro Tammela <pctammela@...atatu.com>,
Jamal Hadi Salim <jhs@...atatu.com>
Subject: Re: [RFC Patch bpf-next] bpf: introduce bpf timer

On Mon, Apr 26, 2021 at 7:02 PM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Mon, Apr 26, 2021 at 04:37:19PM -0700, Cong Wang wrote:
> > On Mon, Apr 26, 2021 at 4:05 PM Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> > >
> > > On Mon, Apr 26, 2021 at 4:00 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
> > > >
> > > > Hi, Alexei
> > > >
> > > > On Wed, Apr 14, 2021 at 9:25 PM Alexei Starovoitov
> > > > <alexei.starovoitov@...il.com> wrote:
> > > > >
> > > > > On Wed, Apr 14, 2021 at 9:02 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
> > > > > >
> > > > > > Then how do you prevent prog being unloaded when the timer callback
> > > > > > is still active?
> > > > >
> > > > > As I said earlier:
> > > > > "
> > > > > If prog refers such hmap as above during prog free the kernel does
> > > > > for_each_map_elem {if (elem->opaque) del_timer().}
> > > > > "
> > > >
> > > > I have discussed this with my colleagues, sharing timers among different
> > > > eBPF programs is a must-have feature for conntrack.
> > > >
> > > > For conntrack, we need to attach two eBPF programs, one on egress and
> > > > one on ingress. They share a conntrack table (an eBPF map), and no matter
> > > > we use a per-map or per-entry timer, updating the timer(s) could happen
> > > > on both sides, hence timers must be shared for both.
> > > >
> > > > So, your proposal we discussed does not work well for this scenario.
> > >
> > > why? The timer inside the map element will be shared just fine.
> > > Just like different progs can see the same map value.
> >
> > Hmm? In the above quotes from you, you suggested removing all the
> > timers installed by one eBPF program when it is freed, but they could be
> > still running independent of which program installs them.
>
> Right. That was before the office hours chat where we discussed an approach
> to remove timers installed by this particular prog only.
> The timers armed by other progs in the same map would be preserved.
>
> > In other words, timers are independent of other eBPF programs, so
> > they should not have an owner. With your proposal, the owner of a timer
> > is the program which contains the subprog (or callback) of the timer.
>
> right. so?
> How is this anything to do with "sharing timers among different eBPF programs"?

It matters a lot which program installs, and hence removes, these
timers. Conceptually, a connection in a conntrack table does not belong
to any program, and neither does the timer associated with it.

If we enforce this ownership, then for conntrack the owner would be
whichever program sees the connection first, which is pretty much
unpredictable. For example, if the ingress program sees a connection
first, it installs a timer for this connection. But the traffic is
bidirectional, so the egress program needs this connection and its
timer too, and we should not remove this timer when the ingress
program is freed.

From another point of view: maps and programs are both first-class
resources in eBPF. A timer is stored in a map and associated with a
program, so it is naturally a first-class resource too.
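
To make the ingress/egress scenario above concrete, here is a minimal
user-space sketch in C. It is not kernel code and does not use any real
BPF timer API; every name in it (ct_map, ct_see_packet,
prog_free_owned, prog_free_shared, ct_armed_count) is hypothetical. It
only models one assumption: the first program to see a connection arms
its timer, and then compares "free the timers this program armed" with
"timers live with the map entry".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CT_ENTRIES 4

/* Hypothetical program identities, for illustration only. */
enum prog_id { PROG_NONE = 0, PROG_INGRESS, PROG_EGRESS };

/* One conntrack entry with an embedded (simulated) timer. */
struct ct_entry {
    bool timer_armed;
    enum prog_id timer_owner; /* program that armed the timer first */
};

struct ct_map {
    struct ct_entry e[CT_ENTRIES];
};

/* A program sees a packet of connection idx. Whichever program sees the
 * connection first arms the timer and becomes its "owner". */
static void ct_see_packet(struct ct_map *m, size_t idx, enum prog_id prog)
{
    assert(idx < CT_ENTRIES);
    if (!m->e[idx].timer_armed) {
        m->e[idx].timer_armed = true;
        m->e[idx].timer_owner = prog;
    }
}

/* Owner-based teardown: freeing a program cancels every timer it armed,
 * even if the other direction still uses the connection. */
static void prog_free_owned(struct ct_map *m, enum prog_id prog)
{
    for (size_t i = 0; i < CT_ENTRIES; i++) {
        if (m->e[i].timer_armed && m->e[i].timer_owner == prog) {
            m->e[i].timer_armed = false;
            m->e[i].timer_owner = PROG_NONE;
        }
    }
}

/* Shared lifetime: timers belong to the map entry, so freeing a program
 * leaves them untouched. */
static void prog_free_shared(struct ct_map *m, enum prog_id prog)
{
    (void)m;
    (void)prog;
}

/* Number of entries whose timer is still armed. */
static size_t ct_armed_count(const struct ct_map *m)
{
    size_t n = 0;

    for (size_t i = 0; i < CT_ENTRIES; i++)
        n += m->e[i].timer_armed ? 1 : 0;
    return n;
}
```

If ingress happens to see connection 0 first and egress sees it
afterwards, prog_free_owned(&m, PROG_INGRESS) disarms the timer of a
connection that egress still tracks, while prog_free_shared() keeps it.
Which side becomes the owner depends purely on packet order.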
>
> > >
> > > Also if your colleagues have something to share they should be
> > > posting to the mailing list. Right now you're acting as a broken phone
> > > passing info back and forth and the knowledge gets lost.
> > > Please ask your colleagues to participate online.
> >
> > They are already in CC from the very beginning. And our use case is
> > public, it is Cilium conntrack:
> > https://github.com/cilium/cilium/blob/master/bpf/lib/conntrack.h
> >
> > The entries of the code are:
> > https://github.com/cilium/cilium/blob/master/bpf/bpf_lxc.c
> >
> > The maps for conntrack are:
> > https://github.com/cilium/cilium/blob/master/bpf/lib/conntrack_map.h
>
> If that's the only goal then kernel timers are not needed.
> cilium conntrack works well as-is.

We are not going back to why user-space cleanup is inefficient again,
are we? ;)

More importantly, although conntrack is our use case, we are obviously
not designing timers just for it. Timers must be as flexible to use as
possible, so that they can serve other, future use cases.

Thanks.