Date:	Thu, 21 Apr 2016 20:25:14 +0800
From:	Wanpeng Li <kernellwp@...il.com>
To:	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Ingo Molnar <mingo@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Wanpeng Li <wanpeng.li@...mail.com>,
	Linux PM list <linux-pm@...r.kernel.org>,
	Steve Muckle <steve.muckle@...aro.org>
Subject: Re: [PATCH] sched/cpufreq: don't trigger cpufreq update w/o real
 rt/deadline tasks running

2016-04-21 20:12 GMT+08:00 Wanpeng Li <kernellwp@...il.com>:
> 2016-04-21 19:11 GMT+08:00 Rafael J. Wysocki <rafael.j.wysocki@...el.com>:
>> On 4/21/2016 3:09 AM, Wanpeng Li wrote:
>>>
>>> 2016-04-21 6:28 GMT+08:00 Rafael J. Wysocki <rafael.j.wysocki@...el.com>:
>>>>
>>>> On 4/21/2016 12:24 AM, Wanpeng Li wrote:
>>>>>
>>>>> 2016-04-20 22:01 GMT+08:00 Peter Zijlstra <peterz@...radead.org>:
>>>>>>
>>>>>> On Wed, Apr 20, 2016 at 02:32:35AM +0200, Rafael J. Wysocki wrote:
>>>>>>>
>>>>>>> On Monday, April 18, 2016 01:51:24 PM Wanpeng Li wrote:
>>>>>>>>
>>>>>>>> Sometimes update_curr() is called w/o tasks actually running; this is
>>>>>>>> captured by:
>>>>>>>>       u64 delta_exec = rq_clock_task(rq) - curr->se.exec_start;
>>>>>>>> We should not trigger a cpufreq update in this case for the rt/deadline
>>>>>>>> classes, and this patch fixes it.
>>>>>>>>
>>>>>>>> Signed-off-by: Wanpeng Li <wanpeng.li@...mail.com>
>>>>>>>
>>>>>>> The signed-off-by tag should agree with the From: header.  One way to
>>>>>>> achieve that is to add an extra From: line at the start of the changelog.
>>>>>>>
>>>>>>> That said, this looks like a good catch that should go into 4.6 to me.
>>>>>>>
>>>>>>> Peter, what do you think?
>>>>>>
>>>>>> I'm confused by the Changelog. *what* ?
>>>>>
>>>>> Sometimes the .update_curr hook is called w/o tasks actually running; this
>>>>> is captured by:
>>>>>
>>>>>           u64 delta_exec = rq_clock_task(rq) - curr->se.exec_start;
>>>>>
>>>>> We should not trigger a cpufreq update in this case for the rt/deadline
>>>>> classes, and this patch fixes it.
>>>>
>>>>
>>>> That's what you wrote in the changelog, no need to repeat that.
>>>>
>>>> I guess Peter is asking for more details, though.  I actually would like
>>>> to get some more details here too.  Like an example of when the situation
>>>> in question actually happens.
>>>
>>> I added a trace print that fires when delta_exec is zero for the rt
>>> class. The output looks like this:
>>>
>>>        watchdog/5-48    [005] d...   568.449095: update_curr_rt: rt delta_exec is zero
>>>        watchdog/5-48    [005] d...   568.449104: <stack trace>
>>>   => pick_next_task_rt
>>>   => __schedule
>>>   => schedule
>>>   => smpboot_thread_fn
>>>   => kthread
>>>   => ret_from_fork
>>>        watchdog/5-48    [005] d...   568.449105: update_curr_rt: rt delta_exec is zero
>>>        watchdog/5-48    [005] d...   568.449111: <stack trace>
>>>   => put_prev_task_rt
>>>   => pick_next_task_idle
>>>   => __schedule
>>>   => schedule
>>>   => smpboot_thread_fn
>>>   => kthread
>>>   => ret_from_fork
>>>        watchdog/6-56    [006] d...   568.510094: update_curr_rt: rt delta_exec is zero
>>>        watchdog/6-56    [006] d...   568.510103: <stack trace>
>>>   => pick_next_task_rt
>>>   => __schedule
>>>   => schedule
>>>   => smpboot_thread_fn
>>>   => kthread
>>>   => ret_from_fork
>>>        watchdog/6-56    [006] d...   568.510105: update_curr_rt: rt delta_exec is zero
>>>        watchdog/6-56    [006] d...   568.510111: <stack trace>
>>>   => put_prev_task_rt
>>>   => pick_next_task_idle
>>>   => __schedule
>>>   => schedule
>>>   => smpboot_thread_fn
>>>   => kthread
>>>   => ret_from_fork
>>> [...]
>>
>>
>> And the statement in your changelog follows from this, I suppose. How does
>> it follow, exactly?
>
> For example, rt task A is about to go to sleep, and rt task B is the next
> candidate to run.
>
> __schedule()
>     -> deactivate_task(A, DEQUEUE_SLEEP)
>         -> dequeue_task_rt()
>             -> update_curr_rt()
>                 -> cpufreq_trigger_update()
>                 -> delta_exec = rq_clock_task(rq) - curr->se.exec_start;
>     [...]
>     -> pick_next_task_rt()
>         -> update_curr_rt()          =>   rq->curr is still A currently
>             -> cpufreq_trigger_update()
>             -> delta_exec = rq_clock_task(rq) - curr->se.exec_start;
>   => delta == 0, actually A is not running between these two updates
>     if (likely(prev != next)) {
>         rq->curr = B;
>        [...]
>      }

Actually, I suspect that there is another cpufreq update w/ delta == 0
due to the current implementation of pick_next_task_rt():

if (prev->sched_class == &rt_sched_class)
    update_curr(rq);    =>   rq->curr is still A currently
[...]
put_prev_task(rq, prev);
    -> update_curr(rq);    =>   rq->curr is still A currently
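
To make the sequence above concrete, here is a small user-space sketch of
it. It is not kernel code: the toy_* structures and functions are
hypothetical stand-ins for rq, rq->curr and cpufreq_trigger_update(), and
the "guarded" variant is only one possible way to skip the hook when
delta_exec is not positive. It assumes the clock does not advance between
the back-to-back update_curr calls made while A is being switched out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's task and runqueue. */
struct toy_task {
    uint64_t exec_start;            /* models curr->se.exec_start */
};

struct toy_rq {
    uint64_t clock_task;            /* models rq_clock_task(rq) */
    struct toy_task *curr;          /* models rq->curr */
    unsigned int cpufreq_updates;   /* how often the hook fired */
};

/* Stand-in for cpufreq_trigger_update(): it only counts invocations. */
static void toy_cpufreq_trigger_update(struct toy_rq *rq)
{
    rq->cpufreq_updates++;
}

/* Ordering from the trace: the hook fires before delta_exec is checked. */
static void toy_update_curr_unguarded(struct toy_rq *rq)
{
    toy_cpufreq_trigger_update(rq);

    uint64_t delta_exec = rq->clock_task - rq->curr->exec_start;
    if ((int64_t)delta_exec <= 0)
        return;                     /* curr did not actually run */

    rq->curr->exec_start = rq->clock_task;
}

/* Assumed alternative: check delta_exec first, then fire the hook. */
static void toy_update_curr_guarded(struct toy_rq *rq)
{
    uint64_t delta_exec = rq->clock_task - rq->curr->exec_start;
    if ((int64_t)delta_exec <= 0)
        return;                     /* nothing ran, so no cpufreq kick */

    toy_cpufreq_trigger_update(rq);
    rq->curr->exec_start = rq->clock_task;
}

/*
 * Task A runs for ran_ns and then goes to sleep. update_curr is called
 * n_calls times in a row (dequeue, pick_next_task, put_prev_task) while
 * rq->curr is still A and the clock stands still. Returns the number of
 * cpufreq hook invocations.
 */
static unsigned int toy_count_updates(bool guarded, uint64_t ran_ns, int n_calls)
{
    assert(n_calls >= 0);

    struct toy_task a = { .exec_start = 0 };
    struct toy_rq rq = { .clock_task = ran_ns, .curr = &a, .cpufreq_updates = 0 };

    for (int i = 0; i < n_calls; i++) {
        if (guarded)
            toy_update_curr_guarded(&rq);
        else
            toy_update_curr_unguarded(&rq);
    }
    return rq.cpufreq_updates;
}
```

With three calls, toy_count_updates(false, 1000, 3) counts every call,
including the two where delta_exec is zero, while
toy_count_updates(true, 1000, 3) counts only the first one, where A
really ran.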

Regards,
Wanpeng Li
