Message-ID: <38431dd0-608d-0fe2-3b9e-024c4ea3565e@linux.alibaba.com>
Date: Wed, 31 Jan 2018 10:51:13 -0800
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: longman@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2 v4] lib: debugobjects: handle objects free in a batch outside the loop
On 1/31/18 8:01 AM, Thomas Gleixner wrote:
> On Thu, 25 Jan 2018, Yang Shi wrote:
>
>> There are nested loops on debug objects free path, sometimes it may take
>> over hundred thousands of loops, then cause soft lockup with
>> !CONFIG_PREEMPT occasionally, like below:
>>
>> NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [stress-ng-getde:110342]
>>
>> CPU: 15 PID: 110342 Comm: stress-ng-getde Tainted: G E 4.9.44-003.ali3000.alios7.x86_64.debug #1
>> Hardware name: Dell Inc. PowerEdge R720xd/0X6FFV, BIOS 1.6.0 03/07/2013
>> task: ffff884cbb0d0000 task.stack: ffff884cabc70000
>> RIP: 0010:[<ffffffff817d647b>] [<ffffffff817d647b>] _raw_spin_unlock_irqrestore+0x3b/0x60
>> RSP: 0018:ffff884cabc77b78 EFLAGS: 00000292
>> RAX: ffff884cbb0d0000 RBX: 0000000000000292 RCX: 0000000000000000
>> RDX: ffff884cbb0d0000 RSI: 0000000000000001 RDI: 0000000000000292
>> RBP: ffff884cabc77b88 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff8357a0d8
>> R13: ffff884cabc77bc8 R14: ffffffff8357a0d0 R15: 00000000000000fc
>> FS: 00002aee845fd2c0(0000) GS:ffff8852bd400000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000002991808 CR3: 0000005123abf000 CR4: 00000000000406e0
>> Stack:
>> ffff884ff4fe0000 ffff884ff4fd8000 ffff884cabc77c00 ffffffff8141177e
>> 0000000000000202 ffff884cbb0d0000 ffff884cabc77bc8 0000000000000006
>> ffff884ff4fda000 ffffffff8357a0d8 0000000000000000 91f5d976f6020b6c
> Please trim that. The register content is not really interesting.
Yes, sure.
>
>> @@ -186,11 +199,25 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
>> static void free_obj_work(struct work_struct *work)
>> {
>> struct debug_obj *objs[ODEBUG_FREE_BATCH];
>> + struct hlist_node *tmp;
>> + struct debug_obj *obj;
>> unsigned long flags;
>> int i;
>>
>> if (!raw_spin_trylock_irqsave(&pool_lock, flags))
>> return;
>> +
>> + /* Move free obj to pool list from global free list */
> If the pool is full, why would you shuffle those objects to the pool list
> first just to free them 10 lines further down?
Yes, thanks for pointing that out. I should have checked whether the pool
list is full before moving free objects to it.
>
>> + if (obj_free > 0) {
> obj_nr_tofree might be a more descriptive name for this variable
OK.
>
>> + hlist_for_each_entry_safe(obj, tmp, &obj_to_free, node) {
>> + hlist_del(&obj->node);
>> + hlist_add_head(&obj->node, &obj_pool);
>> + obj_pool_free++;
>> + obj_pool_used--;
>> + obj_free--;
>> + }
>> + }
> The other thing here is that this whole list walk operation happens with
> the pool lock held and interrupts disabled. That's suboptimal at best.
>
> So the right thing to do here is:
>
> HLIST_HEAD(tofree);
>
> if (!raw_spin_trylock_irqsave(&pool_lock, flags))
> return;
>
> while (obj_pool_free < debug_objects_pool_size) {
> if (!obj_nr_tofree)
> 			break;
> hlist_del(...)
> hlist_add(...)
> }
>
> if (obj_nr_tofree) {
> 	hlist_move_list(&obj_to_free, &tofree);
> ....
> }
>
> while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
> ....
> }
>
> raw_spin_unlock_irqrestore(&pool_lock, flags);
>
> hlist_for_each_entry_safe(obj, tmp, &tofree, node) {
> hlist_del();
> kmem_cache_free(....);
> }
>
> That way you minimize the lock held times and spare pointless list walks
> and shuffling.
Thanks a lot for the suggestion, will fix these in new version.
Regards,
Yang
>
> Thanks,
>
> tglx