linux-kernel - Re: [PATCH 3/3] mm/kmemleak: Prevent soft lockup in first object iteration loop of kmemleak

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <325768cd-19bd-71ae-83d6-1ca5e84f7462@redhat.com>
Date:   Tue, 14 Jun 2022 14:22:40 -0400
From:   Waiman Long <longman@...hat.com>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mm/kmemleak: Prevent soft lockup in first object
 iteration loop of kmemleak_scan()

On 6/14/22 13:27, Catalin Marinas wrote:
> On Tue, Jun 14, 2022 at 06:15:14PM +0100, Catalin Marinas wrote:
>> On Sun, Jun 12, 2022 at 02:33:01PM -0400, Waiman Long wrote:
>>> @@ -1437,10 +1440,25 @@ static void kmemleak_scan(void)
>>>   #endif
>>>   		/* reset the reference count (whiten the object) */
>>>   		object->count = 0;
>>> -		if (color_gray(object) && get_object(object))
>>> +		if (color_gray(object) && get_object(object)) {
>>>   			list_add_tail(&object->gray_list, &gray_list);
>>> +			gray_list_cnt++;
>>> +			object_pinned = true;
>>> +		}
>>>   
I may have the mistaken belief that setting count to 0 will make most 
object gray. Apparently, that may not be the case here.
>>>   		raw_spin_unlock_irq(&object->lock);
>>> +
>>> +		/*
>>> +		 * With object pinned by a positive reference count, it
>>> +		 * won't go away and we can safely release the RCU read
>>> +		 * lock and do a cond_resched() to avoid soft lockup every
>>> +		 * 64k objects.
>>> +		 */
>>> +		if (object_pinned && !(gray_list_cnt & 0xffff)) {
>>> +			rcu_read_unlock();
>>> +			cond_resched();
>>> +			rcu_read_lock();
>>> +		}
>> I'm not sure this gains much. There should be very few gray objects
>> initially (those passed to kmemleak_not_leak() for example). The
>> majority should be white objects.
>>
>> If we drop the fine-grained object->lock, we could instead take
>> kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock)
>> within the loop. I think we can get away with not having an
>> rcu_read_lock() at all for list traversal with the big lock outside the
>> loop.
> Actually this doesn't work is the current object in the iteration is
> freed. Does list_for_each_rcu_safe() help?

list_for_each_rcu_safe() helps if we are worrying about object being 
freed. However, it won't help if object->next is freed instead.

How about something like:

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7dd64139a7c7..fd836e43cb16 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
         struct zone *zone;
         int __maybe_unused i;
         int new_leaks = 0;
+       int loop1_cnt = 0;

         jiffies_last_scan = jiffies;

         /* prepare the kmemleak_object's */
         rcu_read_lock();
         list_for_each_entry_rcu(object, &object_list, object_list) {
+               bool obj_pinned = false;
+
+               loop1_cnt++;
                 raw_spin_lock_irq(&object->lock);
  #ifdef DEBUG
                 /*
@@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
  #endif
                 /* reset the reference count (whiten the object) */
                 object->count = 0;
-               if (color_gray(object) && get_object(object))
+               if (color_gray(object) && get_object(object)) {
                         list_add_tail(&object->gray_list, &gray_list);
+                       obj_pinned = true;
+               }

                 raw_spin_unlock_irq(&object->lock);
+
+               /*
+                * Do a cond_resched() to avoid soft lockup every 64k 
objects.
+                * Make sure a reference has been taken so that the object
+                * won't go away without RCU read lock.
+                */
+               if (loop1_cnt & 0xffff) {
+                       if (!obj_pinned && !get_object(object)) {
+                               /* Try the next object instead */
+                               loop1_cnt--;
+                               continue;
+                       }
+
+                       rcu_read_unlock();
+                       cond_resched();
+                       rcu_read_lock();
+
+                       if (!obj_pinned)
+                               put_object(object);
+               }
         }
         rcu_read_unlock();

Cheers,
Longman