[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f8e571cd-804f-4f57-9fd8-d1697a7a2de3@redhat.com>
Date: Tue, 15 Apr 2025 11:59:04 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Justin Iurman <justin.iurman@...ege.be>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
horms@...nel.org
Subject: Re: [PATCH net-next 2/2] net: ipv6: ioam6: fix double reallocation
On 4/10/25 5:24 PM, Justin Iurman wrote:
> If the dst_entry is the same post transformation (which is a valid use
> case for IOAM), we don't add it to the cache to avoid a reference loop.
> Instead, we use a "fake" dst_entry and add it to the cache as a signal.
> When we read the cache, we compare it with our "fake" dst_entry and
> therefore detect if we're in the special case.
>
> Signed-off-by: Justin Iurman <justin.iurman@...ege.be>
> ---
> net/ipv6/ioam6_iptunnel.c | 40 +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 38 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c
> index 57200b9991a1..bbfb7dd7fa61 100644
> --- a/net/ipv6/ioam6_iptunnel.c
> +++ b/net/ipv6/ioam6_iptunnel.c
> @@ -38,6 +38,7 @@ struct ioam6_lwt_freq {
> };
>
> struct ioam6_lwt {
> + struct dst_entry null_dst;
> struct dst_cache cache;
> struct ioam6_lwt_freq freq;
> atomic_t pkt_cnt;
> @@ -177,6 +178,16 @@ static int ioam6_build_state(struct net *net, struct nlattr *nla,
> if (err)
> goto free_lwt;
>
> + /* We set DST_NOCOUNT, even though this "fake" dst_entry will be stored
> + * in a dst_cache, which will call dst_hold() and dst_release() on it.
> + * These functions don't check for DST_NOCOUNT and modify the reference
> + * count anyway. This is not really a problem, as long as we make sure
> + * that dst_destroy() won't be called (which is the case since the
> + * initial refcount is 1, then +1 to store it in the cache, and then
> + * +1/-1 each time we read the cache and release it).
AFAICS the DST_NOCOUNT flag is not related to the dst reference
counting, but to the dst number accounting for gc's purpose.
The above comment is misleading and should be rephrased, to avoid
confusing whoever is going to look at this code later.
> + */
> + dst_init(&ilwt->null_dst, NULL, NULL, DST_OBSOLETE_NONE, DST_NOCOUNT);
> +
> atomic_set(&ilwt->pkt_cnt, 0);
> ilwt->freq.k = freq_k;
> ilwt->freq.n = freq_n;
> @@ -356,6 +367,17 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> dst = dst_cache_get(&ilwt->cache);
> local_bh_enable();
>
> + /* This is how we notify that the destination does not change after
> + * transformation and that we need to use orig_dst instead of the cache
> + */
> + if (dst == &ilwt->null_dst) {
> + dst_release(dst);
> +
> + dst = orig_dst;
> + /* keep refcount balance: dst_release() is called at the end */
> + dst_hold(dst);
> + }
> +
> switch (ilwt->mode) {
> case IOAM6_IPTUNNEL_MODE_INLINE:
> do_inline:
> @@ -408,8 +430,18 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> goto drop;
> }
>
> - /* cache only if we don't create a dst reference loop */
> - if (orig_dst->lwtstate != dst->lwtstate) {
> + /* If the destination is the same after transformation (which is
> + * a valid use case for IOAM), then we don't want to add it to
> + * the cache in order to avoid a reference loop. Instead, we add
> + * our fake dst_entry to the cache as a way to detect this case.
> + * Otherwise, we add the resolved destination to the cache.
> + */
> + if (orig_dst->lwtstate == dst->lwtstate) {
> + local_bh_disable();
> + dst_cache_set_ip6(&ilwt->cache,
> + &ilwt->null_dst, &fl6.saddr);
> + local_bh_enable();
> + } else {
> local_bh_disable();
> dst_cache_set_ip6(&ilwt->cache, dst, &fl6.saddr);
> local_bh_enable();
Possibly move the BH disable/enable around the if statement for
smaller/more readable code.
Otherwise LGTM, thanks!
Paolo
Powered by blists - more mailing lists