Message-ID: <55978b11-5e7e-4b10-dff1-398275ec68b3@redhat.com>
Date: Fri, 20 Jan 2023 17:54:28 -0500
From: Waiman Long <longman@...hat.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Muchun Song <songmuchun@...edance.com>
Subject: Re: [RESEND PATCH v2 2/2] mm/kmemleak: Fix UAF bug in kmemleak_scan()
On 1/20/23 14:18, Catalin Marinas wrote:
> Hi Waiman,
>
> Thanks for your effort on trying to fix this.
>
> On Wed, Jan 18, 2023 at 11:01:11PM -0500, Waiman Long wrote:
>> @@ -567,7 +574,9 @@ static void __remove_object(struct kmemleak_object *object)
>> rb_erase(&object->rb_node, object->flags & OBJECT_PHYS ?
>> &object_phys_tree_root :
>> &object_tree_root);
>> - list_del_rcu(&object->object_list);
>> + if (!(object->del_state & DELSTATE_NO_DELETE))
>> + list_del_rcu(&object->object_list);
>> + object->del_state |= DELSTATE_REMOVED;
>> }
> So IIUC, this prevents the current object being scanned from being
> removed from the list during the kmemleak_cond_resched() call.
Yes, that is the point.
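To make the intent concrete, here is a minimal single-threaded user-space model of the del_state handshake. It is only a sketch: the flag values, struct model_object, the on_list field and the model_* helpers are made up for illustration, and the kmemleak_lock and RCU parts are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values are assumed for illustration; the hunks above do not show them. */
#define DELSTATE_REMOVED   0x1
#define DELSTATE_NO_DELETE 0x2

/* Hypothetical stand-in for struct kmemleak_object. */
struct model_object {
	unsigned int del_state;
	bool on_list;		/* models membership in object_list */
};

/* Mirrors the __remove_object() hunk: skip the unlink while pinned. */
static void model_remove(struct model_object *obj)
{
	if (!(obj->del_state & DELSTATE_NO_DELETE))
		obj->on_list = false;
	obj->del_state |= DELSTATE_REMOVED;
}

/* First half of kmemleak_cond_resched(): pin unless already removed. */
static bool model_pin(struct model_object *obj)
{
	if (obj->del_state & DELSTATE_REMOVED)
		return false;
	obj->del_state |= DELSTATE_NO_DELETE;
	return true;
}

/* Second half: perform the deferred unlink, then drop the pin. */
static void model_unpin(struct model_object *obj)
{
	if (obj->del_state & DELSTATE_REMOVED)
		obj->on_list = false;
	obj->del_state &= ~DELSTATE_NO_DELETE;
}

/* True if a removal issued while pinned is deferred until unpin. */
static bool model_removal_is_deferred(void)
{
	struct model_object obj = { .del_state = 0, .on_list = true };
	bool stayed_on_list;

	if (!model_pin(&obj))
		return false;
	model_remove(&obj);		/* happens "during" cond_resched() */
	stayed_on_list = obj.on_list;
	model_unpin(&obj);
	return stayed_on_list && !obj.on_list;
}
```

In this model, a removal that arrives while the object is pinned only sets DELSTATE_REMOVED, and the unlink happens in model_unpin(). A removal on an unpinned object unlinks it right away, as before.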
>
>> /*
>> @@ -633,6 +642,7 @@ static void __create_object(unsigned long ptr, size_t size,
>> object->count = 0; /* white color initially */
>> object->jiffies = jiffies;
>> object->checksum = 0;
>> + object->del_state = 0;
>>
>> /* task information */
>> if (in_hardirq()) {
>> @@ -1470,9 +1480,22 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
>> if (!get_object(object))
>> return; /* Try next object */
>>
>> + raw_spin_lock_irq(&kmemleak_lock);
>> + if (object->del_state & DELSTATE_REMOVED)
>> + goto unlock_put; /* Object removed */
>> + object->del_state |= DELSTATE_NO_DELETE;
>> + raw_spin_unlock_irq(&kmemleak_lock);
>> +
>> rcu_read_unlock();
>> cond_resched();
>> rcu_read_lock();
>> +
>> + raw_spin_lock_irq(&kmemleak_lock);
>> + if (object->del_state & DELSTATE_REMOVED)
>> + list_del_rcu(&object->object_list);
>> + object->del_state &= ~DELSTATE_NO_DELETE;
>> +unlock_put:
>> + raw_spin_unlock_irq(&kmemleak_lock);
>> put_object(object);
>> }
> I'm not sure this was the only problem. We do have the problem that the
> current object may be removed from the list, solved above, but another
> scenario I had in mind is the next object being released during this
> brief resched period. The RCU relies on object->next->next being valid
> but, with a brief rcu_read_unlock(), the object->next could be freed,
> reallocated, so object->next->next invalid.
Consider the following scenario:
object->next => A (removed)
A->next => B (removed)
Since object->next still points to A, A must still be allocated and not
yet freed. If B is also removed, there are two possible cases:
1) B is removed from the list after A. In that case, it is not possible
that A is still allocated but B is freed.
2) B is removed before A. A->next can't point to B at the time A is
removed. Due to weak memory ordering, it is possible that another CPU
still sees A->next pointing to B. In that case, I believe we are still
within the grace period, so neither A nor B has been freed.
In fact, it is no different from a regular scan of the object list that
never calls cond_resched().
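
The two removal orders can be checked with a small user-space sketch. It assumes a simplified list whose delete leaves the removed node's next pointer intact, which is what list_del_rcu() does. The struct and helper names are hypothetical, and neither memory ordering nor grace periods are modelled, so it only shows what A->next holds in each case.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list; a stand-in for struct list_head. */
struct node {
	struct node *next;
	struct node *prev;
	char id;
};

/* Append n at the tail of the list anchored at head. */
static void node_add_tail(struct node *head, struct node *n)
{
	n->next = head;
	n->prev = head->prev;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Unlink n the way list_del_rcu() does: the neighbours are re-pointed
 * around n, but n->next is left intact so that a reader still standing
 * on n can keep walking.
 */
static void node_del_keep_next(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = NULL;		/* the kernel stores LIST_POISON2 here */
}

struct demo_list {
	struct node head, obj, a, b, c;
};

/* Build head -> obj -> A -> B -> C -> head. */
static void demo_init(struct demo_list *l)
{
	l->head.id = 'H';
	l->obj.id = 'O';
	l->a.id = 'A';
	l->b.id = 'B';
	l->c.id = 'C';
	l->head.next = l->head.prev = &l->head;
	node_add_tail(&l->head, &l->obj);
	node_add_tail(&l->head, &l->a);
	node_add_tail(&l->head, &l->b);
	node_add_tail(&l->head, &l->c);
}

/* Case 1: A is removed first, then B. Returns the id A->next points to. */
static char a_next_if_a_removed_first(void)
{
	struct demo_list l;

	demo_init(&l);
	node_del_keep_next(&l.a);
	node_del_keep_next(&l.b);
	return l.a.next->id;
}

/* Case 2: B is removed first, then A. Returns the id A->next points to. */
static char a_next_if_b_removed_first(void)
{
	struct demo_list l;

	demo_init(&l);
	node_del_keep_next(&l.b);
	node_del_keep_next(&l.a);
	return l.a.next->id;
}
```

In case 1 the stale A->next still names B, so B has to stay valid for as long as A can still be reached. In case 2 A->next has already been moved past B by the time A is unlinked.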
Cheers,
Longman