linux-kernel - Re: Kmemleak infrastructure improvement for task_struct leaks and call

Message-ID: <20200507171418.GC3180@gaia>
Date:   Thu, 7 May 2020 18:14:19 +0100
From:   Catalin Marinas <catalin.marinas@....com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Qian Cai <cai@....pw>, Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Kmemleak infrastructure improvement for task_struct leaks and
 call_rcu()

On Wed, May 06, 2020 at 10:40:19AM -0700, Paul E. McKenney wrote:
> On Wed, May 06, 2020 at 12:22:37PM -0400, Qian Cai wrote:
> > == call_rcu() leaks ==
> > Another issue that might be relevant is that it seems sometimes,
> > kmemleak will give a lot of false positives (hundreds) because the
> > memory was supposed to be freed by call_rcu()  (for example, in
> > dst_release()) but for some reasons, it takes a long time probably
> > waiting for grace periods or some kind of RCU self-stall, but the
> > memory had already became an orphan. I am not sure how we are going
> > to resolve this properly until we have to figure out why call_rcu()
> > is taking so long to finish?
> 
> I know nothing about kmemleak, but I won't let that stop me from making
> random suggestions...
> 
> One approach is to do an rcu_barrier() inside kmemleak just before
> printing leaked blocks, and check to see if any are still leaked after
> the rcu_barrier().

The main issue is that kmemleak doesn't stop the world when scanning
(which can take over a minute, depending on your hardware), so we get
lots of transient pointer misses. There are some heuristics but
obviously they don't always work.

With RCU, objects are queued for later freeing and chained via
rcu_head.next (IIUC). Under load, this list can be pretty volatile, and
if it changes during kmemleak scanning, it's sufficient to lose track
of one next pointer and the rest of the list would be reported as a leak.
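
To make the failure mode concrete, here is a rough userspace model, not
kmemleak code. The names struct node, chain_init() and scan_chain() are
made up for illustration, and the "miss" is simulated by simply stopping
the walk at one node. It only shows that one lost next pointer leaves
the whole tail of the chain unmarked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an object queued for RCU freeing. */
struct node {
	struct node *next;	/* plays the role of rcu_head.next */
	bool reached;		/* set when the scan finds a pointer to us */
};

/* Link nodes[0..n-1] into a singly linked chain and clear the marks. */
static void chain_init(struct node *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
		nodes[i].reached = false;
	}
}

/*
 * Walk the chain from nodes[0], marking every node we get to. If miss_at
 * is a valid index, pretend the scanner failed to see the next pointer
 * stored in nodes[miss_at] (e.g. the list changed under it). Pass
 * miss_at == n for a scan that misses nothing.
 * Returns the number of nodes left unmarked, i.e. reported as leaks.
 */
static size_t scan_chain(struct node *nodes, size_t n, size_t miss_at)
{
	size_t unreached = 0;
	struct node *p = n ? &nodes[0] : NULL;

	while (p) {
		p->reached = true;
		if ((size_t)(p - nodes) == miss_at)
			break;	/* pointer to the next node was not seen */
		p = p->next;
	}

	for (size_t i = 0; i < n; i++)
		if (!nodes[i].reached)
			unreached++;

	return unreached;
}
```

With an 8-node chain, a miss at index 2 leaves the five nodes behind it
unmarked, even though only a single pointer was lost.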

I think an rcu_barrier() just before starting the kmemleak scan may
help if it reduces the number of objects queued.
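
A toy model of why that could help, again in userspace and with invented
names (struct cb_queue, model_call_rcu(), model_rcu_barrier(),
queue_n()). The real rcu_barrier() waits for already-queued callbacks to
be invoked rather than invoking them itself; the model just runs them
directly, which is enough to show the queue being empty when the scan
would start.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical deferred-free entry, loosely modelled on rcu_head. */
struct cb {
	struct cb *next;
	void (*func)(struct cb *);
};

/* Hypothetical pending-callback queue. */
struct cb_queue {
	struct cb *head;
	struct cb **tail;
	size_t len;
};

static void queue_init(struct cb_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
}

/* Model of call_rcu(): defer func(cb) instead of calling it now. */
static void model_call_rcu(struct cb_queue *q, struct cb *cb,
			   void (*func)(struct cb *))
{
	cb->next = NULL;
	cb->func = func;
	*q->tail = cb;
	q->tail = &cb->next;
	q->len++;
}

/* Model of rcu_barrier(): every callback queued so far has run on return. */
static void model_rcu_barrier(struct cb_queue *q)
{
	struct cb *cb = q->head;

	queue_init(q);
	while (cb) {
		struct cb *next = cb->next;	/* read before func frees cb */

		cb->func(cb);
		cb = next;
	}
}

static void free_cb(struct cb *cb)
{
	free(cb);
}

/* Queue n heap-allocated entries for deferred freeing; returns count queued. */
static size_t queue_n(struct cb_queue *q, size_t n)
{
	size_t queued = 0;

	for (size_t i = 0; i < n; i++) {
		struct cb *cb = malloc(sizeof(*cb));

		if (!cb)
			break;
		model_call_rcu(q, cb, free_cb);
		queued++;
	}
	return queued;
}
```

In this model a scan started right after model_rcu_barrier() has no
chained entries left to lose track of. On a live system new callbacks
keep arriving during the scan, so this narrows the window rather than
closing it.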

Now, I wonder whether kmemleak itself can break this RCU chain. The
kmemleak metadata is allocated on a slab alloc callback. The freeing,
however, is done using call_rcu() because originally calling back into
the slab freeing from kmemleak_free() didn't go well. Since the
kmemleak_object structure is not tracked by kmemleak, I wonder whether
its rcu_head would break this directed pointer reference graph.
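
If that suspicion is right, the effect would look something like the
sketch below. This is a userspace model under that assumption, not a
description of what kmemleak actually does; struct obj, objs_init() and
scan_tracked() are invented names. The scanner here only reads the
memory of objects it tracks, so a next pointer stored inside an
untracked object is never followed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry on a deferred-free chain. */
struct obj {
	struct obj *next;	/* plays the role of rcu_head.next */
	bool tracked;		/* does the leak scanner know this object? */
	bool reached;		/* set when the scan finds a pointer to us */
};

/*
 * Link objs[0..n-1] into one chain. Every object is tracked except
 * objs[untracked_at]; pass untracked_at == n to track all of them.
 */
static void objs_init(struct obj *objs, size_t n, size_t untracked_at)
{
	for (size_t i = 0; i < n; i++) {
		objs[i].next = (i + 1 < n) ? &objs[i + 1] : NULL;
		objs[i].tracked = (i != untracked_at);
		objs[i].reached = false;
	}
}

/* Returns the number of tracked objects the scan never reached. */
static size_t scan_tracked(struct obj *objs, size_t n)
{
	size_t leaks = 0;
	struct obj *p = n ? &objs[0] : NULL;

	/*
	 * Only the contents of tracked objects are scanned, so the walk
	 * stops at the first untracked object in the chain.
	 */
	while (p && p->tracked) {
		p->reached = true;
		p = p->next;
	}

	for (size_t i = 0; i < n; i++)
		if (objs[i].tracked && !objs[i].reached)
			leaks++;

	return leaks;
}
```

With six chained objects and the one at index 2 untracked, the three
tracked objects behind it are never reached and would show up as leaks.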

Let's try the rcu_barrier() first and I'll think about the metadata case
over the weekend.

Thanks.

-- 
Catalin