[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aCyI8T2sWlPLEYZ_@arm.com>
Date: Tue, 20 May 2025 14:51:45 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Jared Kangas <jkangas@...hat.com>, willy@...radead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] radix tree: fix kmemleak false positive in
radix_tree_shrink()
On Wed, May 14, 2025 at 03:16:05PM -0700, Andrew Morton wrote:
> On Wed, 14 May 2025 11:01:37 -0700 Jared Kangas <jkangas@...hat.com> wrote:
> > Mark the new xa_head as a transient leak to prevent this false positive
> > report.
> >
> > ...
> >
> > --- a/lib/radix-tree.c
> > +++ b/lib/radix-tree.c
> > @@ -509,6 +509,14 @@ static inline bool radix_tree_shrink(struct radix_tree_root *root)
> > if (is_idr(root) && !tag_get(node, IDR_FREE, 0))
> > root_tag_clear(root, IDR_FREE);
> >
> > + /*
> > + * Kmemleak might report a false positive if it traverses the
> > + * tree while we're shrinking it, since the reference moves
> > + * from node->slots[0] to root->xa_head.
> > + */
> > + if (radix_tree_is_internal_node(child))
> > + kmemleak_transient_leak(entry_to_node(child));
> > +
>
> There is only one other caller of kmemleak_transient_leak(). Makes me
> think that perhaps a more fundamental fix is needed.
The other user is iova_depot_pop() where a struct iova_magazine object
is temporarily unreachable. But you have a point, we could try to change
the heuristics a bit to reduce such false positives. The only way to
guarantee is to use stop_machine() during scanning but that can take
minutes, so not really feasible.
We do have some heuristics as the checksum calculation that works well
for doubly-linked lists. However, when the reference to an object is
moved from one location to another while the object itself is not
updated, this won't be caught.
That said, the current logic updates the checksum when it was found
unreferenced (potential leak). If the object was referenced in
subsequent scans, the checksum remains intact, so minutes later it may
be seen again as a transient leak. Something like below should reduce
the transient leak reports (though not eliminate):
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index c12cef3eeb32..b1460c64f4b1 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1408,6 +1408,12 @@ static void update_refs(struct kmemleak_object *object)
/* put_object() called when removing from gray_list */
WARN_ON(!get_object(object));
list_add_tail(&object->gray_list, &gray_list);
+ /*
+ * Reset the checksum; the object must be unchanged between
+ * two consecutive scans where it was unreferenced
+ * (color_white()) in order to be reported as a leak.
+ */
+ object->checksum = 0;
}
}
--
Catalin
Powered by blists - more mailing lists