linux-kernel - Re: [PATCH] sched: fix tsk->pi_lock isn't held when do_set_cpus


Date:	Fri, 28 Aug 2015 09:49:17 +0800
From:	Wanpeng Li <wanpeng.li@...mail.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched: fix tsk->pi_lock isn't held when
 do_set_cpus_allowed()

On 8/28/15 6:18 AM, Peter Zijlstra wrote:
> On Tue, Aug 25, 2015 at 07:47:44PM +0800, Wanpeng Li wrote:
>> On 8/25/15 6:32 PM, Peter Zijlstra wrote:
>>> So Possibly, Maybe (I'm still too wrecked to say for sure), something
>>> like this would work:
>>>
>>> 	WARN_ON(debug_locks && (lockdep_is_held(&p->pi_lock) ||
>>> 				(p->on_rq && lockdep_is_held(&rq->lock))));
>>>
>>> Instead of those two separate lockdep asserts.
>>>
>>> Please consider carefully.
> So the normal rules for changing task_struct::cpus_allowed are holding
> both pi_lock and rq->lock, such that holding either stabilizes the mask.
>
> This is so that wakeup can happen without rq->lock and load-balance
> without pi_lock.
>
> From this we already get the relaxation that we can omit acquiring
> rq->lock if the task is not on the rq, because in that case
> load-balancing will not apply to it.
>
> ** these are the rules currently tested in do_set_cpus_allowed() **
>
> Now, since __set_cpus_allowed_ptr() uses task_rq_lock() which
> unconditionally acquires both locks, we could get away with holding just
> rq->lock when on_rq for modification because that'd still exclude
> __set_cpus_allowed_ptr(), it would also work against
> __kthread_bind_mask() because that assumes !on_rq.
>
> That said, this is all somewhat fragile.
>
>> Commit 5e16bbc2f ("sched: Streamline the task migration locking a little")
>> no longer holds pi_lock in the migrate_tasks() path. Even before that
>> commit, pi_lock was not held when calling select_fallback_rq(); it was
>> held in __migrate_task(). Commit 25834c73f93 ("sched: Fix a race between
>> __kthread_bind() and sched_setaffinity()") then added a
>> lockdep_assert_held() in do_set_cpus_allowed(), which triggers the bug.
>> How about something like the patch below:
>>
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5186,6 +5186,15 @@ static void migrate_tasks(struct rq *dead_rq)
>>                  BUG_ON(!next);
>>                  next->sched_class->put_prev_task(rq, next);
>>
>> +               raw_spin_unlock(&rq->lock);
>> +               raw_spin_lock(&next->pi_lock);
>> +               raw_spin_lock(&rq->lock);
>> +               if (!(task_rq(next) == rq && task_on_rq_queued(next))) {
>> +                       raw_spin_unlock(&rq->lock);
>> +                       raw_spin_unlock(&next->pi_lock);
>> +                       continue;
>> +               }
> Yeah, that's quite disgusting.. also you'll trip over the lockdep_pin if
> you were to actually run this.

Indeed. I will handle lockdep_pin in this code if you choose the
second fragile option. :-)

Regards,
Wanpeng Li

>
> Now, I don't think dropping rq->lock is quite as disastrous as it
> usually is because !cpu_active at this point, which means load-balance
> will not interfere, but that too is somewhat fragile.
>
>
> So we end up with a choice of two fragile.. let me ponder that a wee
> bit more.
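
To make the two locking rules discussed above concrete, here is a minimal
userspace sketch in C. It is not kernel code: `struct lock_state`,
`strict_rule_ok()` and `relaxed_rule_ok()` are hypothetical names, and the
boolean fields stand in for what lockdep_is_held() and p->on_rq would report.
The relaxed predicate follows the proposed WARN_ON() condition, on the
assumption that the intent is to warn when that condition does not hold.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the state a lockdep check would look at when
 * do_set_cpus_allowed() is called. Not a kernel structure.
 */
struct lock_state {
	bool pi_lock_held;	/* models lockdep_is_held(&p->pi_lock) */
	bool rq_lock_held;	/* models lockdep_is_held(&rq->lock) */
	bool on_rq;		/* models p->on_rq */
};

/*
 * Rule currently tested, as described in the thread: pi_lock must be held,
 * and rq->lock must also be held when the task is on a runqueue.
 */
static bool strict_rule_ok(const struct lock_state *s)
{
	return s->pi_lock_held && (!s->on_rq || s->rq_lock_held);
}

/*
 * Relaxed rule from the proposed WARN_ON(): holding pi_lock is enough, and
 * holding only rq->lock is also accepted when the task is on a runqueue.
 */
static bool relaxed_rule_ok(const struct lock_state *s)
{
	return s->pi_lock_held || (s->on_rq && s->rq_lock_held);
}
```

With the migrate_tasks() situation from the report (task queued, rq->lock
held, pi_lock not held), strict_rule_ok() returns false and relaxed_rule_ok()
returns true. With neither lock held, both return false.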
