Message-ID: <CAJwJo6Y0hfB5royz0D8QPkgpRMxz9G5s81ktZbtHPBdt8EYdEQ@mail.gmail.com>
Date: Tue, 2 Jul 2024 03:08:42 +0100
From: Dmitry Safonov <0x7f454c46@...il.com>
To: paulmck@...nel.org
Cc: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, rcu@...r.kernel.org,
Catalin Marinas <catalin.marinas@....com>
Subject: Re: [TEST] TCP MD5 vs kmemleak
Hi Paul,
Sorry for the delayed answer: I got a respiratory infection and was in
bed last week.
On Thu, 20 Jun 2024 at 18:02, Paul E. McKenney <paulmck@...nel.org> wrote:
>
> On Wed, Jun 19, 2024 at 01:33:36AM +0100, Dmitry Safonov wrote:
[..]
> > I'm sorry guys, that was me being inadequate.
>
> I know that feeling!
Thanks :)
[adding/quoting back the context of the thread that I cut previously]
> On Tue, Jun 18, 2024 at 10:02:10AM -0700, Jakub Kicinski wrote:
> > On Tue, 18 Jun 2024 09:42:35 -0700 Paul E. McKenney wrote:
> > > > FTR, with mptcp self-tests we hit a few kmemleak false positives on RCU
> > > > freed pointers, that were addressed by this patch:
> > > >
> > > > commit 5f98fd034ca6fd1ab8c91a3488968a0e9caaabf6
> > > > Author: Catalin Marinas <catalin.marinas@....com>
> > > > Date: Sat Sep 30 17:46:56 2023 +0000
> > > >
> > > > rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
> > > >
> > > > I'm wondering if this is hitting something similar? Possibly due to
> > > > lazy RCU callbacks invoked after MSECS_MIN_AGE???
> >
> > Dmitry mentioned this commit, too, but we use the same config for MPTCP
> > tests, and while we repro TCP AO failures quite frequently, mptcp
> > doesn't seem to have failed once.
> >
> > > Fun! ;-)
> > >
> > > This commit handles memory passed to kfree_rcu() and friends, but
> > > not memory passed to call_rcu() and friends. Of course, call_rcu()
> > > does not necessarily know the full extent of the memory passed to it,
> > > for example, if passed a linked list, call_rcu() will know only about
> > > the head of that list.
> > >
> > > There are similar challenges with synchronize_rcu() and friends.
> >
> > To be clear I think Dmitry was suspecting kfree_rcu(), he mentioned
> > call_rcu() as something he was expecting to have a similar issue but
> > it in fact appeared immune.
On Thu, 20 Jun 2024 at 18:02, Paul E. McKenney <paulmck@...nel.org> wrote:
> > That's a real issue, rather than a false-positive:
> > https://lore.kernel.org/netdev/20240619-tcp-ao-required-leak-v1-1-6408f3c94247@gmail.com/
>
> So we need call_rcu() to mark memory flowing through it? If so, we
> need help from callers of call_rcu() in the case where more than one
> object is being freed.
Not sure. I think one way to avoid explicitly marking pointers for
call_rcu(), or even to avoid needing the patch above, would be a hack
like this:
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index d5b6fba44fc9..7a5eb55a155c 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1587,6 +1587,7 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
 static void kmemleak_scan(void)
 {
 	struct kmemleak_object *object;
+	unsigned long gp_start_state;
 	struct zone *zone;
 	int __maybe_unused i;
 	int new_leaks = 0;
@@ -1630,6 +1631,7 @@ static void kmemleak_scan(void)
 		kmemleak_cond_resched(object);
 	}
 	rcu_read_unlock();
+	gp_start_state = get_state_synchronize_rcu();

 #ifdef CONFIG_SMP
 	/* per-cpu sections scanning */
@@ -1690,6 +1692,13 @@ static void kmemleak_scan(void)
 	 */
 	scan_gray_list();

+	/*
+	 * Wait for the greylist objects potentially migrating from
+	 * RCU callbacks or maybe getting freed.
+	 */
+	cond_synchronize_rcu(gp_start_state);
+	rcu_barrier();
+
 	/*
 	 * Check for new or unreferenced objects modified since the previous
 	 * scan and color them gray until the next scan.
-->8--
I'm not quite sure this makes sense, as it's my first time in the
kmemleak code; adding Catalin.
But then, if I didn't mess up, it's going to work for only one RCU
grace period, so if some object calls call_rcu()/kfree_rcu() from its
callback, it's still going to be a false positive.
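To illustrate that limitation, here is a toy user-space model in C. It
is not kernel code: every toy_* name is made up for this sketch, and one
"round" merely stands in for waiting out a grace period and then calling
rcu_barrier(). The assumption it encodes is that such a wait only covers
callbacks that were already queued when it started.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code: every name here is hypothetical. */

struct toy_head {
	struct toy_head *next;
	void (*func)(struct toy_head *head);
};

static struct toy_head *toy_pending;	/* callbacks waiting for a round */
static int toy_freed;			/* objects finally "freed" */

/* Model of call_rcu(): queue a callback to run after the next round. */
static void toy_call_rcu(struct toy_head *head, void (*func)(struct toy_head *))
{
	assert(head && func);
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/*
 * Model of one grace-period wait plus barrier: run only the callbacks
 * queued before the wait. Anything queued by those callbacks stays
 * pending until the next round.
 */
static void toy_wait_one_round(void)
{
	struct toy_head *batch = toy_pending;

	toy_pending = NULL;
	while (batch) {
		struct toy_head *next = batch->next;

		batch->func(batch);
		batch = next;
	}
}

/* Callback that really releases the object. */
static void toy_free_cb(struct toy_head *head)
{
	(void)head;
	toy_freed++;
}

/* Callback that defers the real free by one more round. */
static void toy_requeue_cb(struct toy_head *head)
{
	toy_call_rcu(head, toy_free_cb);
}

/* Count objects still sitting on the callback list after one round. */
int toy_pending_after_one_round(void)
{
	static struct toy_head simple, chained;
	int pending = 0;

	toy_pending = NULL;
	toy_freed = 0;
	toy_call_rcu(&simple, toy_free_cb);
	toy_call_rcu(&chained, toy_requeue_cb);
	toy_wait_one_round();

	for (struct toy_head *h = toy_pending; h; h = h->next)
		pending++;
	return pending;
}
```

With one plain callback and one re-queueing callback,
toy_pending_after_one_round() returns 1: the chained object survives the
first wait and only goes away after a second toy_wait_one_round().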
Some other options/ideas:
- a more RCU-invasive option would be to add a lookup function over the
segmented lists of callbacks, which would allow kmemleak to ask RCU
whether a pointer is still referenced by RCU.
- a third option is for kmemleak to trust RCU and generalize the
commit above by adding an OBJECT_RCU flag to kmemleak_object::flags,
which would be set by call_rcu()/kfree_rcu() and cleared once the
callback is invoked for RCU_DONE_TAIL.
- add kmemleak_object::update_jiffies, or just directly touch
kmemleak_object::jiffies, whenever the object has been modified (see
update_checksum()), so that recently changed objects are ignored. Since
the rcu_head gets updated when an object is queued, that is going to
automagically ignore objects deferred to RCU.
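The last idea can be sketched as a toy user-space model in C. Again,
this is not kernel code: struct toy_object, TOY_MIN_AGE and the toy_*
helpers are hypothetical, the hash is a stand-in for whatever checksum
kmemleak really computes, and "ticks" stand in for jiffies. The only
point is that a detected modification restarts the object's age clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model, NOT kernel code: every name here is hypothetical. */

#define TOY_MIN_AGE 5	/* ticks an object must stay unmodified before a report */

struct toy_object {
	const unsigned char *mem;	/* tracked allocation */
	size_t size;
	uint32_t checksum;		/* checksum seen at the previous scan */
	unsigned long stamp;		/* tick of allocation or last seen change */
};

/* FNV-1a hash, used here only as a simple stand-in checksum. */
static uint32_t toy_checksum(const unsigned char *mem, size_t size)
{
	uint32_t hash = 2166136261u;

	for (size_t i = 0; i < size; i++) {
		hash ^= mem[i];
		hash *= 16777619u;
	}
	return hash;
}

/* Refresh the checksum; if the contents changed, restart the age clock. */
static bool toy_update_checksum(struct toy_object *obj, unsigned long now)
{
	uint32_t sum = toy_checksum(obj->mem, obj->size);

	if (sum == obj->checksum)
		return false;
	obj->checksum = sum;
	obj->stamp = now;
	return true;
}

/* Report an unreferenced object only once it has been quiet long enough. */
bool toy_should_report(struct toy_object *obj, bool referenced, unsigned long now)
{
	assert(obj && obj->mem);
	toy_update_checksum(obj, now);
	return !referenced && now - obj->stamp >= TOY_MIN_AGE;
}
```

In this model, writing to the object (as queueing its rcu_head would)
between two scans pushes any report back by TOY_MIN_AGE ticks from the
scan that noticed the change.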
Not sure if I overlooked anything, and it seems there is no hurry in
fixing anything yet, as this is a theoretical issue at the moment.
Hopefully, these ideas don't look like nonsense,
Dmitry