[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <164399157371.4980.14890218612337167330@kwain>
Date: Fri, 04 Feb 2022 17:19:33 +0100
From: Antoine Tenart <atenart@...nel.org>
To: Daniel Borkmann <daniel@...earbox.net>, davem@...emloft.net,
kuba@...nel.org
Cc: netdev@...r.kernel.org, vladbu@...dia.com, pabeni@...hat.com,
pshelar@....org, wenxu@...oud.cn
Subject: Re: [PATCH net 1/2] net: do not keep the dst cache when uncloning an skb dst and its metadata
Quoting Daniel Borkmann (2022-02-04 13:33:20)
> On 2/2/22 2:44 PM, Antoine Tenart wrote:
> > Quoting Daniel Borkmann (2022-02-02 13:13:30)
> >> On 2/2/22 12:01 PM, Antoine Tenart wrote:
> >>> When uncloning an skb dst and its associated metadata a new dst+metadata
> >>> is allocated and the tunnel information from the old metadata is copied
> >>> over there.
> >>>
> >>> The issue is the tunnel metadata has references to cached dst, which are
> >>> copied along the way. When a dst+metadata refcount drops to 0 the
> >>> metadata is freed including the cached dst entries. As they are also
> >>> referenced in the initial dst+metadata, this ends up in UaFs.
> >>>
> >>> In practice the above did not happen because of another issue, the
> >>> dst+metadata was never freed because its refcount never dropped to 0
> >>> (this will be fixed in a subsequent patch).
> >>>
> >>> Fix this by initializing the dst cache after copying the tunnel
> >>> information from the old metadata to also unshare the dst cache.
> >>>
> >>> Fixes: d71785ffc7e7 ("net: add dst_cache to ovs vxlan lwtunnel")
> >>> Cc: Paolo Abeni <pabeni@...hat.com>
> >>> Reported-by: Vlad Buslov <vladbu@...dia.com>
> >>> Tested-by: Vlad Buslov <vladbu@...dia.com>
> >>> Signed-off-by: Antoine Tenart <atenart@...nel.org>
> >>> ---
> >>> include/net/dst_metadata.h | 13 ++++++++++++-
> >>> 1 file changed, 12 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> >>> index 14efa0ded75d..c8f8b7b56bba 100644
> >>> --- a/include/net/dst_metadata.h
> >>> +++ b/include/net/dst_metadata.h
> >>> @@ -110,8 +110,8 @@ static inline struct metadata_dst *tun_rx_dst(int md_size)
> >>> static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >>> {
> >>> struct metadata_dst *md_dst = skb_metadata_dst(skb);
> >>> - int md_size;
> >>> struct metadata_dst *new_md;
> >>> + int md_size, ret;
> >>>
> >>> if (!md_dst || md_dst->type != METADATA_IP_TUNNEL)
> >>> return ERR_PTR(-EINVAL);
> >>> @@ -123,6 +123,17 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >>>
> >>> memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> >>> sizeof(struct ip_tunnel_info) + md_size);
> >>> +#ifdef CONFIG_DST_CACHE
> >>> + ret = dst_cache_init(&new_md->u.tun_info.dst_cache, GFP_ATOMIC);
> >>> + if (ret) {
> >>> + /* We can't call metadata_dst_free directly as the still shared
> >>> + * dst cache would be released.
> >>> + */
> >>> + kfree(new_md);
> >>> + return ERR_PTR(ret);
> >>> + }
> >>> +#endif
> >>
> >> Could you elaborate (e.g. also in commit message) how this interacts
> >> or whether it is needed for TUNNEL_NOCACHE users? (Among others,
> >> latter is used by BPF, for example.)
> >
> > My understanding is that TUNNEL_NOCACHE is used to decide whether or not
> > to use a dst cache, that might or might not come from the tunnel info
> > attached to an skb. The dst cache being allocated in a tunnel info is
> > orthogonal to the use of TUNNEL_NOCACHE. While looking around I actually
> > found a code path explicitly setting both, in nft_tunnel_obj_init (that
> > might need to be investigated though but it is another topic).
>
> Good point, this is coming from 3e511d5652ce ("netfilter: nft_tunnel: Add dst_cache
> support") and was added only after af308b94a2a4 ("netfilter: nf_tables: add tunnel
> support") which initially indicated TUNNEL_NOCACHE. This is indeed contradictory.
> wenxu (+Cc), ptal.
>
> > It doesn't look like initializing the dst cache would break
> > TUNNEL_NOCACHE users as ip_tunnel_dst_cache_usable would return false
> > anyway. Having said that, we probably want to unshare the dst cache only
> > if there is one already, checking for
> > 'md_dst->u.tun_info.dst_cache.cache != NULL' first.
>
> Meaning, if that is the case, we wouldn't require the dst_cache_init()
> and thus extra alloc, right? Would make sense afaics.
Meaning:
memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
sizeof(struct ip_tunnel_info) + md_size);
+#ifdef CONFIG_DST_CACHE
+ if (new_md->u.tun_info.dst_cache.cache) {
+ int ret = dst_cache_init(&new_md->u.tun_info.dst_cache,
+ GFP_ATOMIC);
+ if (ret) {
+ metadata_dst_free(new_md);
+ return ERR_PTR(ret);
+ }
+ }
+#endif
So that the cache is unshared only if there was one in the first place.
If there was no cache to unshare, we can save the extra alloc.
> db3c6139e6ea ("bpf, vxlan, geneve, gre: fix usage of dst_cache on
> xmit") had some details related to BPF use.
With the above commit if TUNNEL_NOCACHE is set the tunnel cache
shouldn't be used, regardless of it being allocated. I guess with that
and the above, we should be good.
Thanks!
Antoine
Powered by blists - more mailing lists