Message-ID: <76169a43-cda0-177a-2b1f-7dcdad900935@arm.com>
Date:   Wed, 13 Jul 2022 11:43:57 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Schspa Shi <schspa@...il.com>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 1/2] sched/rt: fix bad task migration for rt tasks

On 12/07/2022 17:35, Schspa Shi wrote:
> 
> Dietmar Eggemann <dietmar.eggemann@....com> writes:
> 
>> On 12/07/2022 17:05, Schspa Shi wrote:

[...]

>> What code-base is this?
> 
> These are the logs from 5.10.59-rt
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git
> v5.10.59-rt52 (9007b684f615750b0ee4ec57b5e547a4bf4a223e).

Thanks.

>> IMHO, currently this `WARN_ON_ONCE(is_migration_disabled(p))` in
>> set_task_cpu() is at > line 3000.
>>
> 
> But the master code has this bug too.

I see. It's just that need_to_push in task_woken_rt() triggers
push_rt_tasks() much more often on preempt-rt.

[...]

>>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>>> index 8c9ed96648409..7bd3e6ecbe45e 100644
>>> --- a/kernel/sched/rt.c
>>> +++ b/kernel/sched/rt.c
>>> @@ -1998,11 +1998,15 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
>>>  			 * the mean time, task could have
>>>  			 * migrated already or had its affinity changed.
>>>  			 * Also make sure that it wasn't scheduled on its rq.
>>> +			 * It is possible the task was scheduled, set
>>> +			 * "migrate_disabled" and then got preempted, so we must
>>> +			 * check the task migration disable flag here too.
>>>  			 */
>>>  			if (unlikely(task_rq(task) != rq ||
>>>  				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
>>>  				     task_running(rq, task) ||
>>>  				     !rt_task(task) ||
>>> +				     is_migration_disabled(task) ||
>>
>> I wonder why this isn't covered by `task_rq(task) != rq` in this condition?
>>
> 
> It's because this task has not migrated; it just got scheduled,
> called migrate_disable(), and was then preempted on its own CPU
> before calling migrate_enable(). task_rq does not change in this
> scenario.

Yes, I get it now. Essentially we need `current CPU (CPU0) != rq->cpu
(CPU1)`. Now I see that you had the discussion with Steven already on v3 ;-)
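
To make the scenario concrete, here is a rough user-space sketch of the re-check in find_lock_lowest_rq() with and without the new condition. This is not kernel code: struct model_rq, struct model_task, old_check_rejects(), new_check_rejects() and preempted_with_migration_disabled() are names made up for illustration, and the CPU mask is reduced to a plain bitmask.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's struct rq and task_struct. */
struct model_rq {
	int cpu;
};

struct model_task {
	struct model_rq *rq;      /* models task_rq(task) */
	bool running;             /* models task_running(rq, task) */
	bool is_rt;               /* models rt_task(task) */
	unsigned int cpus_mask;   /* models task->cpus_mask, one bit per CPU */
	int migration_disabled;   /* models the migrate_disable() nesting count */
};

/* Re-check as it was before the patch: true means "do not push this task". */
static bool old_check_rejects(const struct model_task *t,
			      const struct model_rq *rq,
			      const struct model_rq *lowest_rq)
{
	return t->rq != rq ||
	       !(t->cpus_mask & (1u << lowest_rq->cpu)) ||
	       t->running ||
	       !t->is_rt;
}

/* Re-check with the extra migration-disabled test added by the patch. */
static bool new_check_rejects(const struct model_task *t,
			      const struct model_rq *rq,
			      const struct model_rq *lowest_rq)
{
	return old_check_rejects(t, rq, lowest_rq) ||
	       t->migration_disabled > 0;
}

/*
 * Task state after the race window: the task ran on rq, called
 * migrate_disable(), and was preempted before migrate_enable().
 * It is still queued on the same rq and is no longer running.
 */
static struct model_task preempted_with_migration_disabled(struct model_rq *rq)
{
	struct model_task t = {
		.rq = rq,
		.running = false,
		.is_rt = true,
		.cpus_mask = 0xfu,        /* CPUs 0-3 allowed in this model */
		.migration_disabled = 1,
	};

	return t;
}
```

With a task built by preempted_with_migration_disabled() for CPU1's rq and a lowest_rq on CPU2, old_check_rejects() returns false (the push would go ahead), while new_check_rejects() returns true, because task_rq(task) alone cannot tell the two states apart.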
