Date:	Tue, 4 Nov 2014 07:57:48 +0800
From:	Wanpeng Li <wanpeng.li@...ux.intel.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ingo Molnar <mingo@...hat.com>,
	Kirill Tkhai <ktkhai@...allels.com>,
	Juri Lelli <juri.lelli@....com>, linux-kernel@...r.kernel.org,
	Wanpeng Li <wanpeng.li@...ux.intel.com>
Subject: Re: [PATCH RFC] sched/deadline: support dl task migrate during cpu
 hotplug

Hi Peter,
On Mon, Nov 03, 2014 at 11:41:11AM +0100, Peter Zijlstra wrote:
>On Fri, Oct 31, 2014 at 03:28:17PM +0800, Wanpeng Li wrote:
>> Hi all,
>> 
>> I observe that a dl task can't be migrated to other cpus during cpu hotplug; in 
>> addition, the task may or may not run again if the cpu is added back. The root cause 
>> I found is that a dl task is throttled and removed from the dl rq after 
>> consuming all of its budget, so the stop task can't pick it up from the dl rq and 
>> migrate it to other cpus during hotplug. 
>> 
>> So I try two methods.
>> 
>> - add the throttled dl sched_entity to a throttled_list; the list is traversed
>>   during cpu hotplug, and each dl sched_entity is picked and enqueued so that the 
>>   stop task can pick and migrate it. However, the dl sched_entity is throttled again 
>>   before the stop task runs, because of the path below. This path sets rq->online to 0, 
>>   so set_rq_offline() won't be called in migration_call().
>> 
>
>This seems wrong to me; this screws around with the CBS by replenishing
>too soon.

Agreed.

>
>> @@ -1593,9 +1602,20 @@ static void rq_online_dl(struct rq *rq)
>>  /* Assumes rq->lock is held */
>>  static void rq_offline_dl(struct rq *rq)
>>  {
>> +	struct task_struct *p, *n;
>> +
>>  	if (rq->dl.overloaded)
>>  		dl_clear_overload(rq);
>>  
>> +	/* Make sched_dl_entity available for pick_next_task() */
>> +	list_for_each_entry_safe(p, n, &rq->dl.throttled_list, dl.throttled_node) {
>> +		p->dl.dl_throttled = 0;
>> +		hrtimer_cancel(&p->dl.dl_timer);
>> +		p->dl.dl_runtime = p->dl.dl_runtime;
>> +		if (task_on_rq_queued(p))
>> +			enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>> +	}
>> +
>>  	cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
>>  }
>
>
>So what is wrong with making dl_task_timer() deal with it? The timer
>will still fire at the correct time; canceling it and/or otherwise
>messing with the CBS is wrong. Once it fires, all we need to do is
>migrate it to another cpu (preferably one that is still online of course
>:-).

Do you mean that I should push the task to another cpu in dl_task_timer() 
if the rq is offline? In addition, what will happen if the dl task can't preempt 
the task running on that other cpu?
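
To make the question concrete, the suggestion can be sketched as a small
userspace model. This is not kernel code: toy_rq, toy_dl_task,
toy_find_online_cpu() and toy_dl_task_timer() are hypothetical names, the
replenishment rule is simplified to "deadline = now + period", and the target
selection (an online cpu with no dl task, else the one whose current deadline
is latest) is only an assumption about what a reasonable choice might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* Toy stand-ins for the kernel structures; every name here is hypothetical. */
struct toy_rq {
	bool online;            /* false once the cpu is going down */
	bool has_dl;            /* a deadline task is currently running here */
	uint64_t curr_deadline; /* absolute deadline of that running task */
};

struct toy_dl_task {
	int cpu;                /* rq the task is attached to */
	bool throttled;         /* budget exhausted, waiting for the timer */
	uint64_t runtime;       /* remaining budget */
	uint64_t dl_runtime;    /* budget granted per period */
	uint64_t dl_period;     /* replenishment period */
	uint64_t deadline;      /* absolute deadline */
};

/*
 * Pick an online cpu: the first one with no dl task, otherwise the one
 * whose running dl task has the latest deadline. Returns -1 if none.
 */
static int toy_find_online_cpu(const struct toy_rq *rqs, int nr)
{
	int best = -1;

	for (int i = 0; i < nr; i++) {
		if (!rqs[i].online)
			continue;
		if (!rqs[i].has_dl)
			return i;
		if (best < 0 || rqs[i].curr_deadline > rqs[best].curr_deadline)
			best = i;
	}
	return best;
}

/*
 * Model of the replenishment timer firing at its scheduled time 'now'.
 * The budget is refilled only here, so the CBS timing is left alone.
 * If the task's rq went offline in the meantime, the task is moved to an
 * online cpu. Returns true if the task would preempt there, false if it
 * just stays queued behind an earlier deadline (or no cpu is online).
 */
static bool toy_dl_task_timer(struct toy_dl_task *p, struct toy_rq *rqs,
			      int nr, uint64_t now)
{
	assert(p->cpu >= 0 && p->cpu < nr);

	/* Simplified replenishment: full budget, new deadline one period out. */
	p->runtime = p->dl_runtime;
	p->deadline = now + p->dl_period;
	p->throttled = false;

	if (!rqs[p->cpu].online) {
		int dst = toy_find_online_cpu(rqs, nr);

		if (dst < 0)
			return false;   /* nowhere to go in this model */
		p->cpu = dst;
	}

	return !rqs[p->cpu].has_dl || p->deadline < rqs[p->cpu].curr_deadline;
}
```

In this model a false return covers the "can't preempt" case: the task is
still moved and stays queued on the destination until its deadline becomes
the earliest one there. Whether the real code should behave the same way is
exactly the open question.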

Regards,
Wanpeng Li 