Date:   Tue, 17 Nov 2020 11:29:18 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Oleksandr Natalenko <oleksandr@...alenko.name>,
        linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        tglx@...utronix.de, rostedt@...dmis.org
Subject: Re: WARNING at kernel/sched/core.c:2013 migration_cpu_stop+0x2e3/0x330


On 17/11/20 11:06, Peter Zijlstra wrote:
> On Mon, Nov 16, 2020 at 10:00:14AM +0000, Valentin Schneider wrote:
>> 
>> On 15/11/20 22:32, Oleksandr Natalenko wrote:
>> > Hi.
>> >
>> > I'm running v5.10-rc3-rt7 for some time, and I came across this splat in 
>> > dmesg:
>> >
>> > ```
>> > [118769.951010] ------------[ cut here ]------------
>> > [118769.951013] WARNING: CPU: 19 PID: 146 at kernel/sched/core.c:2013 
>> 
>> Err, I didn't pick up on this back then, but isn't that check bogus? If the
>> task is enqueued elsewhere, it's valid for it not to be affined
>> 'here'. Also that is_migration_disabled() check within is_cpu_allowed()
>> makes me think this isn't the best thing to call on a remote task.
>> 
>> ---
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 1218f3ce1713..47d5b677585f 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2010,7 +2010,7 @@ static int migration_cpu_stop(void *data)
>>  		 * valid again. Nothing to do.
>>  		 */
>>  		if (!pending) {
>> -			WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq)));
>> +			WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), p->cpus_ptr));
>
> Ho humm.. bit of a mess that. I'm trying to figure out if we need that
> is_per_cpu_kthread() test here or not.
>
> I suppose not, what we want here is to ensure the CPU is in cpus_mask
> and not care about the whole hotplug mess.
>

That was my thought as well. On top of that, is_cpu_allowed() reads
p->migration_disabled, which isn't so great when p is a remote task.
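
To make the remote-task case concrete, here is a small userspace sketch of
the two checks. Everything in it is a toy stand-in invented for
illustration (struct toy_task, the toy_* helpers, a 64-bit integer as the
affinity mask); it is not kernel code and leaves out locking, hotplug and
migrate_disable() entirely.

```c
#include <assert.h>   /* only needed for the usage example below */
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a task; NOT the kernel's task_struct. */
struct toy_task {
	uint64_t cpus_mask;	/* bit n set => task may run on CPU n */
	int cpu;		/* CPU whose runqueue currently holds the task */
};

/* Toy equivalent of a cpumask bit test. */
static bool toy_mask_test_cpu(int cpu, uint64_t mask)
{
	return cpu >= 0 && cpu < 64 && ((mask >> cpu) & 1u);
}

/* Shape of the old check: is p allowed on the CPU running the stopper? */
static bool toy_check_stopper_cpu(const struct toy_task *p, int stopper_cpu)
{
	return toy_mask_test_cpu(stopper_cpu, p->cpus_mask);
}

/* Shape of the proposed check: is p allowed on the CPU it is actually on? */
static bool toy_check_task_cpu(const struct toy_task *p)
{
	return toy_mask_test_cpu(p->cpu, p->cpus_mask);
}
```

With a task affined to CPU 3 and enqueued on CPU 3, while the stopper runs
on CPU 19, toy_check_stopper_cpu(&p, 19) is false (the old-style check
would warn) but toy_check_task_cpu(&p) is true, which is the situation
described above.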

> Would it make sense to replace both instances in migration_cpu_stop()
> with:
>
> 	WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask));
>
> ?

I guess so; I was trying to see if we could factorize this, but stopped
mid-swing as I'm really wary of shuffling too much of this code (even with
the help of TLA+; well, maybe *because* of it).
