Message-ID: <m2ilo779f9.fsf@gmail.com>
Date:   Sat, 09 Jul 2022 02:19:42 +0800
From:   Schspa Shi <schspa@...il.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] sched/rt: fix bad task migration for rt tasks


Steven Rostedt <rostedt@...dmis.org> writes:

> On Fri, 08 Jul 2022 12:51:14 +0800
> Schspa Shi <schspa@...il.com> wrote:
>
>> Steven Rostedt <rostedt@...dmis.org> writes:
>> 
>> > On Fri,  8 Jul 2022 00:50:14 +0800
>> > Schspa Shi <schspa@...il.com> wrote:
>> >  
>> >> Please refer to the following scenarios.  
>> >
>> > I'm not sure this is what is happening. Do you have a trace to back
>> > this up?
>> >  
>> 
>> I don't have a trace. This is inferred from the exception log.
>> 
>> >> 
>> >>            CPU0                                  CPU1
>> >> ------------------------------------------------------------------
>> >> push_rt_task
>> >>   check is_migration_disabled(next_task)
>> >>                                         task not running and
>> >>                                         migration_disabled == 0
>> >>   find_lock_lowest_rq(next_task, rq);
>> >>     _double_lock_balance(this_rq, busiest);
>> >>       raw_spin_rq_unlock(this_rq);
>> >>       double_rq_lock(this_rq, busiest);
>> >>         <<wait for busiest rq>>
>> >>                                             <wakeup>  
>> >
>> > Here's the problem I have. next_task is queued on CPU0 (otherwise CPU0
>> > would not be pushing it). As CPU0 is currently running push_rt_task, how
>> > did next_task start running to set its migrate_disable flag?
>> 
>> The next_task wasn't queued on CPU0; it's queued on CPU1 in this
>> scenario.
>
> Bah, I forgot that we still do pushing for other CPUs. I was thinking that
> we removed that in favor of pulling. It's been a while since I worked on
> this.
>
>> 
>> And that's because when the task wakes up, the rq argument is not the
>> current running CPU's rq; it's next_task's rq
>> (i.e. CPU1's rq in this sample scenario).
>> 
>> And you can check this with the Call trace from the crash log.
>> 
>>     [123671.996969] Call trace:
>>     [123671.996975]  set_task_cpu+0x8c/0x108
>>     [123671.996984]  push_rt_task.part.0+0x144/0x184
>>     [123671.996995]  push_rt_tasks+0x28/0x3c
>>     [123671.997002]  task_woken_rt+0x58/0x68
>>     [123671.997009]  ttwu_do_wakeup+0x5c/0xd0
>>     [123671.997019]  ttwu_do_activate+0xc0/0xd4
>>     [123671.997028]  try_to_wake_up+0x244/0x288
>>     [123671.997036]  wake_up_process+0x18/0x24
>>     [123671.997045]  __irq_wake_thread+0x64/0x80
>>     [123671.997056]  __handle_irq_event_percpu+0x110/0x124
>> 
>> Function ttwu_do_wakeup will lock the task's rq, not the current
>> running CPU's rq.
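
For reference, the wakeup path looks roughly like this (simplified from
kernel/sched/core.c, with non-essential details dropped):

static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p,
			   int wake_flags, struct rq_flags *rf)
{
	/* rq is task_rq(p) -- CPU1's rq in the scenario above -- even
	 * though this code runs on the waking CPU (CPU0). */
	check_preempt_curr(rq, p, wake_flags);
	WRITE_ONCE(p->__state, TASK_RUNNING);

	if (p->sched_class->task_woken) {
		/* For an RT task this is task_woken_rt(), which ends up
		 * in push_rt_tasks() -> push_rt_task() with the task's
		 * rq, not the current CPU's rq. */
		p->sched_class->task_woken(rq, p);
	}
}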
>> 
>> >
>> > Even if it was woken up on another CPU and ran there, by setting
>> > migrate_disable, it would not be put back to CPU0, because its
>> > migrate_disable flag is set (if it is, then there's the bug).
>> >  
>> 
>> There's no need to put it back to CPU0 for this issue; it's still on
>> CPU1.
>> 
>
> Worse things can actually happen than migrating a migrate disabled task.
> What prevents next_task from being scheduled and in a running state, or
> even migrated?
>
> Hmm, that's covered in find_lock_lowest_rq().
>
> Looks like the migrate disable check needs to go there.
>
> 		/* if the prio of this runqueue changed, try again */
> 		if (double_lock_balance(rq, lowest_rq)) {
> 			/*
> 			 * We had to unlock the run queue. In
> 			 * the mean time, task could have
> 			 * migrated already or had its affinity changed.
> 			 * Also make sure that it wasn't scheduled on its rq.
> 			 */
> 			if (unlikely(task_rq(task) != rq ||
> 				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
> 				     task_running(rq, task) ||
> 				     !rt_task(task) ||
> +				     is_migration_disabled(task) ||
> 				     !task_on_rq_queued(task))) {
>

Yes, that's what I did in the V1 patch.
Link: https://lore.kernel.org/all/20220623182932.58589-1-schspa@gmail.com/

But I don't think it's the best solution for this problem.
In these scenarios, we still have a chance to make the task run faster
by retrying to push the task currently running on this CPU away.

There are more details in the reply to the V2 patch.
Link: https://lore.kernel.org/all/CAMA88TrZ-o4W81Yfw9Wcs3ghoxwpeAKtFejtMTt78GNB0tKaSA@mail.gmail.com/#t
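
The direction I have in mind is roughly as follows (a sketch of the
idea only, not the actual V2/V3 code; push_rt_task() already has the
retry: label and the is_migration_disabled() branch that the sketch
jumps back to):

	/* In push_rt_task(): if next_task became migration-disabled
	 * while rq->lock was dropped inside find_lock_lowest_rq(),
	 * go around again instead of bailing out, so the
	 * is_migration_disabled() branch at the retry: label can try
	 * to push the task currently running on this CPU away. */
	lowest_rq = find_lock_lowest_rq(next_task, rq);
	if (!lowest_rq && is_migration_disabled(next_task)) {
		put_task_struct(next_task);	/* ref taken before the lookup */
		goto retry;
	}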

> 				double_unlock_balance(rq, lowest_rq);
> 				lowest_rq = NULL;
> 				break;
> 			}
> 		}
>
> -- Steve
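
For reference, is_migration_disabled() just reads the per-task flag
that migrate_disable() sets, approximately (kernel/sched/sched.h):

static inline bool is_migration_disabled(struct task_struct *p)
{
#ifdef CONFIG_SMP
	return p->migration_disabled;
#else
	return false;
#endif
}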

-- 
BRs
Schspa Shi
