[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHsH6Gu2VgREhgiFwvT8OokuJEt7kxh7ifUnFqPMnwiKOeu4aw@mail.gmail.com>
Date: Sat, 3 Dec 2022 09:35:22 +0200
From: Eyal Birger <eyal.birger@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
linux-kselftest@...r.kernel.org, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
steffen.klassert@...unet.com, herbert@...dor.apana.org.au,
andrii@...nel.org, daniel@...earbox.net, nicolas.dichtel@...nd.com,
razor@...ckwall.org, mykolal@...com, ast@...nel.org,
song@...nel.org, yhs@...com, john.fastabend@...il.com,
kpsingh@...nel.org, sdf@...gle.com, haoluo@...gle.com,
jolsa@...nel.org, shuah@...nel.org, liuhangbin@...il.com,
lixiaoyan@...gle.com
Subject: Re: [PATCH bpf-next,v4 2/4] xfrm: interface: Add unstable helpers for
setting/getting XFRM metadata from TC-BPF
On Sat, Dec 3, 2022 at 5:55 AM Eyal Birger <eyal.birger@...il.com> wrote:
>
> Hi Martin,
>
> On Fri, Dec 2, 2022 at 11:27 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> >
> > On 12/2/22 12:49 PM, Eyal Birger wrote:
> > > On Fri, Dec 2, 2022 at 10:27 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> > >>
> > >> On 12/2/22 11:42 AM, Eyal Birger wrote:
> > >>> Hi Martin,
> > >>>
> > >>> On Fri, Dec 2, 2022 at 9:08 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> > >>>>
> > >>>> On 12/2/22 1:59 AM, Eyal Birger wrote:
> > >>>>> +__used noinline
> > >>>>> +int bpf_skb_set_xfrm_info(struct __sk_buff *skb_ctx,
> > >>>>> + const struct bpf_xfrm_info *from)
> > >>>>> +{
> > >>>>> + struct sk_buff *skb = (struct sk_buff *)skb_ctx;
> > >>>>> + struct metadata_dst *md_dst;
> > >>>>> + struct xfrm_md_info *info;
> > >>>>> +
> > >>>>> + if (unlikely(skb_metadata_dst(skb)))
> > >>>>> + return -EINVAL;
> > >>>>> +
> > >>>>> + md_dst = this_cpu_ptr(xfrm_md_dst);
> > >>>>> +
> > >>>>> + info = &md_dst->u.xfrm_info;
> > >>>>> +
> > >>>>> + info->if_id = from->if_id;
> > >>>>> + info->link = from->link;
> > >>>>> + skb_dst_force(skb);
> > >>>>> + info->dst_orig = skb_dst(skb);
> > >>>>> +
> > >>>>> + dst_hold((struct dst_entry *)md_dst);
> > >>>>> + skb_dst_set(skb, (struct dst_entry *)md_dst);
> > >>>>
> > >>>>
> > >>>> I may be missed something obvious and this just came to my mind,
> > >>>>
> > >>>> What stops cleanup_xfrm_interface_bpf() being run while skb is still holding the
> > >>>> md_dst?
> > >>>>
> > >>> Oh I think you're right. I missed this.
> > >>>
> > >>> In order to keep this implementation I suppose it means that the module would
> > >>> not be allowed to be removed upon use of this kfunc. but this could be seen as
> > >>> annoying from the configuration user experience.
> > >>>
> > >>> Alternatively the metadata dsts can be separately allocated from the kfunc,
> > >>> which is probably the simplest approach to maintain, so I'll work on that
> > >>> approach.
> > >>
> > >> If it means dst_alloc on every skb, it will not be cheap.
> > >>
> > >> Another option is to metadata_dst_alloc_percpu() once during the very first
> > >> bpf_skb_set_xfrm_info() call and the xfrm_md_dst memory will never be freed. It
> > >> is a tradeoff but likely the correct one. You can take a look at
> > >> bpf_get_skb_set_tunnel_proto().
> > >>
> > >
> > > Yes, I originally wrote this as a helper similar to the tunnel key
> > > helper which uses bpf_get_skb_set_tunnel_proto(), and when converting
> > > to kfuncs I kept the
> > > percpu implementation.
> > >
> > > However, the set tunnel key code is never unloaded. Whereas taking this
> > > approach here would mean that this memory would leak on each module reload
> > > iiuc.
> >
> > 'struct metadata_dst __percpu *xfrm_md_dst' cannot be in the xfrm module.
> > filter.c could be an option.
>
> Looking at it some more, won't the module reference taken by the kfunc btf
> guarantee that the module can't be unloaded while the kfunc is used by a
> loaded program?
>
> I tried this using a synthetic test attaching the program to a dummy interface
> and the module couldn't be unloaded while the program was loaded.
>
> In such case, is it possible for the memory to be freed while there are in-use
> percpu metadata dsts?
Decided to err on the side of caution and avoid the release of the percpu
dsts. It seems unlikely that the module could be unloaded while there are
in flight skbs pointing to the percpu memory, but it's safer not to rely on
this, and the cost is rather minimal, so I agree this is the correct tradeoff.
Eyal.
Powered by blists - more mailing lists