lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <66e526c8-9d06-460b-b5df-92697634106b@redhat.com>
Date:   Tue, 28 Nov 2023 21:35:34 -0500
From:   Waiman Long <longman@...hat.com>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/kmemleak: Add cond_resched() to kmemleak_free_percpu()


On 11/28/23 11:04, Catalin Marinas wrote:
> On Mon, Nov 27, 2023 at 02:41:53PM -0500, Waiman Long wrote:
>>   /**
>>    * kmemleak_free_percpu - unregister a previously registered __percpu object
>>    * @ptr:	__percpu pointer to beginning of the object
>>    *
>>    * This function is called from the kernel percpu allocator when an object
>> - * (memory block) is freed (free_percpu).
>> + * (memory block) is freed (free_percpu). Since this function is inherently
>> + * slow especially on systems with a large number of CPUs, defer the actual
>> + * removal of kmemleak objects associated with the percpu pointer to a
>> + * workqueue if it is not in a task context.
>>    */
>>   void __ref kmemleak_free_percpu(const void __percpu *ptr)
>>   {
>> -	unsigned int cpu;
>> -
>>   	pr_debug("%s(0x%px)\n", __func__, ptr);
>>   
>> -	if (kmemleak_free_enabled && ptr && !IS_ERR(ptr))
>> -		for_each_possible_cpu(cpu)
>> -			delete_object_full((unsigned long)per_cpu_ptr(ptr,
>> -								      cpu));
>> +	if (!kmemleak_free_enabled || !ptr || IS_ERR(ptr))
>> +		return;
>> +
>> +	if (!in_task()) {
>> +		struct kmemleak_percpu_addr *addr;
>> +
>> +		addr = kzalloc(sizeof(*addr), GFP_ATOMIC);
>> +		if (addr) {
>> +			INIT_WORK(&addr->work, kmemleak_free_percpu_workfn);
>> +			addr->ptr = ptr;
>> +			queue_work(system_long_wq, &addr->work);
>> +			return;
>> +		}
> We can't defer this freeing. It can mess up the kmemleak metadata if the
> per-cpu pointer is re-allocated before kmemleak removed it from its
> object tree.
You are right. In fact, it is possible for kmemleak_free_percpu() be 
called from softIRQ context. And if the system has hundreds of CPUs, it 
will take a long time to process all the free request.
>
> The problem is looking up the object tree for each per-cpu offset. We
> can make the percpu pointer handling O(1) since freeing is only done by
> the main __percpu pointer, so that's the only one needing a look-up. So
> far the per-cpu pointers are not tracked for leaking, only scanned.
>
> We could just add the per_cpu_ptr(ptr, 0) to the kmemleak
> object_tree_root but when scanning we don't have an inverse function to
> get the __percpu pointer back and calculate the pointers for the other
> CPUs (well, we could with some hacks but they are probably fragile).
We could keep a separate tree to track the percpu area. We will know the 
max percpu offset in each percpu area. The base of the percpu area is 
just per_cpu_ptr(0, cpu).
>
> What I came up with is a separate object_percpu_tree_root similar to the
> object_phys_tree_root. The only reason for these additional trees is to
> look up the kmemleak metadata when needed (usually freeing). They don't
> contain objects that are tracked for actual leaking, only scanned. A
> briefly tested patch below. I need to go through it again, update some
> comments and write a commit log:

That sounds like a good idea like what I have said above. I will do a 
more careful review of the change tomorrow as it is getting late for me 
today.

Cheers,
Longman


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ