Message-ID: <Z0D4dCaAf4CVJTde@pop-os.localdomain>
Date: Fri, 22 Nov 2024 13:32:36 -0800
From: Cong Wang <xiyou.wangcong@...il.com>
To: Alexandre Ferrieux <alexandre.ferrieux@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, edumazet@...gle.com, jhs@...atatu.com,
jiri@...nulli.us, horms@...nel.org, netdev@...r.kernel.org
Subject: Re: RFC: chasing all idr_remove() misses
On Tue, Nov 19, 2024 at 07:46:32AM +0100, Alexandre Ferrieux wrote:
> On 19/11/2024 04:51, Cong Wang wrote:
> > On Thu, Nov 14, 2024 at 07:24:27PM +0100, Alexandre Ferrieux wrote:
> >> Hi,
> >>
> >> In the recent fix of u32's IDR leaks, one side remark is that the problem went
> >> unnoticed for 7 years due to the NULL result from idr_remove() being ignored at
> >> this call site.
> >> [...]
> >> So, unless we have reasons to think cls_u32 was the only place where two ID
> >> encodings might lend themselves to confusion, I'm wondering if it wouldn't make
> >> sense to chase the issue more systematically:
> >>
> >> - either with WARN_ON[_ONCE](idr_remove()==NULL) on each call site individually
> >> (a year-long endeavor implying tens of maintainers)
> >>
> >> - or with WARN_ON[_ONCE] just before returning NULL within idr_remove() itself,
> >> or even radix_tree_delete_item().
> >>
> >> Opinions ?
> >
> > Yeah, or simply WARN_ON uncleaned IDR in idr_destroy(), which is a more
> > common pattern.
>
> No, in the general case, idr_destroy() only happens at the end of life of an IDR
> set. Some structures in the kernel have a long lifetime, which means possibly
> slipping out of fuzzers' scrutiny.
Sure, move it to wherever you believe is appropriate.

Detecting resource leaks at destroy time is a very common pattern. For a
quick example, in inet_sock_destruct() we detect skb accounting leaks:
	WARN_ON_ONCE(atomic_read(&sk->sk_rmem_alloc));
	WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc));
	WARN_ON_ONCE(sk->sk_wmem_queued);
	WARN_ON_ONCE(sk_forward_alloc_get(sk));
Another example of IDR leakage detection can be found in
drivers/gpu/drm/vmwgfx/ttm_object.c:
void ttm_object_device_release(struct ttm_object_device **p_tdev)
{
	struct ttm_object_device *tdev = *p_tdev;

	*p_tdev = NULL;

	WARN_ON_ONCE(!idr_is_empty(&tdev->idr));
	idr_destroy(&tdev->idr);

	kfree(tdev);
}
>
> As an illustration, in cls_u32 itself, in the 2048-delete-add loop I use in the
> tdc test committed with the fix, idr_destroy(&tp_c->handle_idr) is called only
> at the "cleanup" step, when deleting the interface.
>
> You can only imagine, in the hundreds of other uses of IDR, the "miss rate" that
> would follow from targeting idr_destroy() instead of idr_remove().
>
I am not saying it is suitable for this specific case; I am just saying it is a
common pattern for you to consider, that's all.
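
To make the two placements discussed above concrete, here is a minimal
userspace sketch, not kernel code. Everything named toy_idr_* is
hypothetical and only models the idea: a remove that returns NULL on a miss
(as idr_remove() does) and counts it, and a destroy that reports whatever
is still stored (in the spirit of WARN_ON_ONCE(!idr_is_empty()) before
idr_destroy()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an IDR: a fixed table mapping
 * small integer IDs to pointers. None of these names are kernel APIs. */
#define TOY_SLOTS 64

struct toy_idr {
	void *slot[TOY_SLOTS];
	unsigned long remove_misses;	/* stand-in for a WARN at remove time */
};

void toy_idr_init(struct toy_idr *t)
{
	assert(t != NULL);
	for (size_t i = 0; i < TOY_SLOTS; i++)
		t->slot[i] = NULL;
	t->remove_misses = 0;
}

/* Store ptr under id. Returns 0 on success, -1 if id is out of range,
 * ptr is NULL, or the slot is already in use. */
int toy_idr_insert(struct toy_idr *t, unsigned int id, void *ptr)
{
	if (id >= TOY_SLOTS || ptr == NULL || t->slot[id] != NULL)
		return -1;
	t->slot[id] = ptr;
	return 0;
}

/* Returns the removed pointer, or NULL if nothing was stored under id.
 * A miss is counted right away, at the call that went wrong. */
void *toy_idr_remove(struct toy_idr *t, unsigned int id)
{
	void *old = (id < TOY_SLOTS) ? t->slot[id] : NULL;

	if (old == NULL) {
		t->remove_misses++;
		return NULL;
	}
	t->slot[id] = NULL;
	return old;
}

/* Destroy-time check: clears the table and returns how many entries
 * were still present, i.e. the leaks visible only at teardown. */
size_t toy_idr_destroy(struct toy_idr *t)
{
	size_t leaked = 0;

	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (t->slot[i] != NULL) {
			leaked++;
			t->slot[i] = NULL;
		}
	}
	return leaked;
}
```

In this model, inserting under ID 5 and then removing with ID 6 (a mixed-up
ID encoding) bumps remove_misses immediately, while toy_idr_destroy() only
reports the leftover entry when the table is finally torn down.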
Thanks.