netdev - Re: [TEST] TCP MD5 vs kmemleak

Message-ID: <5cbc0ea9-8752-423f-a6c6-274e0c4ea3cc@paulmck-laptop>
Date: Mon, 8 Jul 2024 11:29:08 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Dmitry Safonov <0x7f454c46@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	rcu@...r.kernel.org, Catalin Marinas <catalin.marinas@....com>
Subject: Re: [TEST] TCP MD5 vs kmemleak

On Tue, Jul 02, 2024 at 03:08:42AM +0100, Dmitry Safonov wrote:
> Hi Paul,
> 
> Sorry for the delayed answer, I got a respiratory infection, so was in
> bed last week,
> 
> On Thu, 20 Jun 2024 at 18:02, Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > On Wed, Jun 19, 2024 at 01:33:36AM +0100, Dmitry Safonov wrote:
> [..]
> > > I'm sorry guys, that was me being inadequate.
> >
> > I know that feeling!
> 
> Thanks :)
> 
> [adding/quoting back the context of the thread that I cut previously]
> 
> > On Tue, Jun 18, 2024 at 10:02:10AM -0700, Jakub Kicinski wrote:
> > > On Tue, 18 Jun 2024 09:42:35 -0700 Paul E. McKenney wrote:
> > > > > > FTR, with mptcp self-tests we hit a few kmemleak false positives on RCU
> > > > > > freed pointers, that were addressed by this patch:
> > > > >
> > > > > commit 5f98fd034ca6fd1ab8c91a3488968a0e9caaabf6
> > > > > Author: Catalin Marinas <catalin.marinas@....com>
> > > > > Date:   Sat Sep 30 17:46:56 2023 +0000
> > > > >
> > > > >     rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
> > > > >
> > > > > I'm wondering if this is hitting something similar? Possibly due to
> > > > > lazy RCU callbacks invoked after MSECS_MIN_AGE???
> > >
> > > Dmitry mentioned this commit, too, but we use the same config for MPTCP
> > > tests, and while we repro TCP AO failures quite frequently, mptcp
> > > doesn't seem to have failed once.
> > >
> > > > Fun!  ;-)
> > > >
> > > > This commit handles memory passed to kfree_rcu() and friends, but
> > > > not memory passed to call_rcu() and friends.  Of course, call_rcu()
> > > > does not necessarily know the full extent of the memory passed to it,
> > > > for example, if passed a linked list, call_rcu() will know only about
> > > > the head of that list.
> > > >
> > > > There are similar challenges with synchronize_rcu() and friends.
> > >
> > > To be clear I think Dmitry was suspecting kfree_rcu(), he mentioned
> > > call_rcu() as something he was expecting to have a similar issue but
> > > it in fact appeared immune.
> 
> On Thu, 20 Jun 2024 at 18:02, Paul E. McKenney <paulmck@...nel.org> wrote:
> > > That's a real issue, rather than a false-positive:
> > > https://lore.kernel.org/netdev/20240619-tcp-ao-required-leak-v1-1-6408f3c94247@gmail.com/
> >
> > So we need call_rcu() to mark memory flowing through it?  If so, we
> > need help from callers of call_rcu() in the case where more than one
> > object is being freed.
> 
> Not sure, I think one way to avoid explicitly marking pointers for
> call_rcu() or even avoiding the patch above would be a hackery like
> this:
> 
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index d5b6fba44fc9..7a5eb55a155c 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -1587,6 +1587,7 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
>  static void kmemleak_scan(void)
>  {
>         struct kmemleak_object *object;
> +       unsigned long gp_start_state;
>         struct zone *zone;
>         int __maybe_unused i;
>         int new_leaks = 0;
> @@ -1630,6 +1631,7 @@ static void kmemleak_scan(void)
>                         kmemleak_cond_resched(object);
>         }
>         rcu_read_unlock();
> +       gp_start_state = get_state_synchronize_rcu();
> 
>  #ifdef CONFIG_SMP
>         /* per-cpu sections scanning */
> @@ -1690,6 +1692,13 @@ static void kmemleak_scan(void)
>          */
>         scan_gray_list();
> 
> +       /*
> +        * Wait for the greylist objects potentially migrating from
> +        * RCU callbacks or maybe getting freed.
> +        */
> +       cond_synchronize_rcu(gp_start_state);
> +       rcu_barrier();

You lost me on this one.  If you are waiting only on RCU callbacks,
the rcu_barrier() suffices.  If you are also waiting on objects to be
freed after a synchronize_rcu(), kfree_rcu() or similar, you are waiting
only for the corresponding grace period to end, not necessarily for the
freeing of objects following that synchronize_rcu().

So what exactly needs to be waited upon?

I should hasten to add that the fact that there is currently no way
to wait on kfree_rcu()'s frees is considered to be a bug, but that
bug is currently only known to be a problem in the case of a module
that uses kfree_rcu() on objects obtained from a kmem_cache structure,
and that also passes that kmem_cache structure to kmem_cache_destroy()
at module-unload time.

In the meantime, the workaround is to use call_rcu() instead of
kfree_rcu() in that case, which is what existing code does.

>         /*
>          * Check for new or unreferenced objects modified since the previous
>          * scan and color them gray until the next scan.
> 
> 
> -->8--
> 
> Not quite sure if this makes sense, my first time at kmemleak code,
> adding Catalin.
> But then if I didn't mess up, it's going to work only for one RCU
> period, so in case some object calls call_rcu()/kfree_rcu() from the
> callback, it's still going to be a false positive.
> 
> Some other options/ideas:
> - more RCU-invasive option which would be adding a mapping function to
> segmented lists of callbacks, which would allow kmemleak to request
> from RCU if the pointer is yet referenced by RCU.
> - the third option is for kmemleak to trust RCU and generalize the
> commit above, adding kmemleak_object::flags of OBJECT_RCU, which will
> be set by call_rcu()/kfree_rcu() and unset once the callback is
> invoked for RCU_DONE_TAIL.
> - add kmemleak_object::update_jiffies or just directly touch
> kmemleak_object::jiffies whenever the object has been modified (see
> update_checksum()), that will ignore recently changed objects. As
> rcu_head should be updated, that is going to automagically ignore
> those deferred to RCU objects.

You could also make greater use of the polled RCU API, marking
an object (or, perhaps more space-efficiently, a group of
objects) using get_state_synchronize_rcu_full(), then using
poll_state_synchronize_rcu_full() to see when all readers should be done.
There are additional tricks that can be used if needed.
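
For illustration only, here is a userspace toy model of that polled
pattern.  Nothing in it is kernel code: toy_gp_completed, toy_get_state(),
toy_poll_state(), toy_gp_end(), and struct toy_object with its helpers are
all hypothetical stand-ins, and grace periods are reduced to a single
counter that only records completed ones (counter wrap is ignored).
In the kernel, the cookie would instead be a struct rcu_gp_oldstate filled
in by get_state_synchronize_rcu_full() and checked with
poll_state_synchronize_rcu_full().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RCU's grace-period bookkeeping:
 * the number of grace periods that have fully completed. */
static unsigned long toy_gp_completed;

/* Toy analogue of taking a polled-RCU cookie: name the grace period
 * that must complete before pre-existing readers are known to be done. */
static unsigned long toy_get_state(void)
{
	return toy_gp_completed + 1;
}

/* Toy analogue of polling a cookie: true once that grace period ended. */
static bool toy_poll_state(unsigned long cookie)
{
	return toy_gp_completed >= cookie;
}

/* Simulate the end of one grace period. */
static void toy_gp_end(void)
{
	toy_gp_completed++;
}

/* Hypothetical per-object (or per-group) tracking record. */
struct toy_object {
	unsigned long rcu_cookie;	/* cookie taken when handed to RCU */
	bool queued_for_rcu;		/* object was handed to RCU */
};

/* Mark the object at the point where it is handed to RCU. */
static void toy_mark_rcu(struct toy_object *obj)
{
	assert(obj != NULL);
	obj->rcu_cookie = toy_get_state();
	obj->queued_for_rcu = true;
}

/* A scanner may consider reporting the object only if RCU is no longer
 * a possible reason for it to look unreferenced. */
static bool toy_may_report(const struct toy_object *obj)
{
	assert(obj != NULL);
	if (!obj->queued_for_rcu)
		return true;
	return toy_poll_state(obj->rcu_cookie);
}
```

Note that, as with the real API, a cookie that has expired says only
that the grace period has ended.  It does not say that the callback has
run or that the object has been freed, so a scanner built this way would
treat an expired cookie as "no longer excused by RCU", not as "freed".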

							Thanx, Paul

> Not sure if I overlooked anything and it seems there is no hurry in
> fixing anything yet, as this is a theoretical issue at this moment.
> 
> Hopefully, these ideas don't look like nonsense,
>              Dmitry