lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d3184f0b64e877cb37968c4499aadb94@codeaurora.org>
Date:   Wed, 27 Jun 2018 07:07:50 -0700
From:   Sodagudi Prasad <psodagud@...eaurora.org>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     "Isaac J. Manjarres" <isaacm@...eaurora.org>, peterz@...radead.org,
        matt@...eblueprint.co.uk, mingo@...nel.org, tglx@...utronix.de,
        gregkh@...uxfoundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] stop_machine: Remove cpu swap from stop_two_cpus

On 2018-06-27 00:15, Sebastian Andrzej Siewior wrote:
> On 2018-06-26 14:28:26 [-0700], Isaac J. Manjarres wrote:
>> Remove CPU ID swapping in stop_two_cpus() so that the
>> source CPU's stopper thread is added to the wake queue last,
>> so that the source CPU's stopper thread is woken up last,
>> ensuring that all other threads that it depends on are woken
>> up before it runs.
> 
> You can't do that because you could deadlock while locking the stoper
> lock.
<Prasad> Without this change boot up issues are observed with Linux 
4.14.52.
One of the core is executing the stopper thread after wake_up_q() in
cpu_stop_queue_two_works() function, without waking up other cores 
stopper thread.
We see this issue 100% on device boot up with Linux 4.14.52.

Could you please explain bit more how the deadlock occurs?
static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work 
*work1,
                                     int cpu2, struct cpu_stop_work 
*work2)
{
         struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
         struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
         DEFINE_WAKE_Q(wakeq);
         int err;
retry:
         raw_spin_lock_irq(&stopper1->lock);
         raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);

<SNIP>

I think, you are suggesting to switch the locking sequence too.
stopper2->lock and stopper1->lock.

could you please share the test case to stress this code flow?

> Couldn't you swap cpu1+cpu2 and work1+work2?
<Prasad> Work1 and work2 are having same data contents.


> 
>> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
>> index f89014a..d10d633 100644
>> --- a/kernel/stop_machine.c
>> +++ b/kernel/stop_machine.c
>> @@ -307,8 +307,6 @@ int stop_two_cpus(unsigned int cpu1, unsigned int 
>> cpu2, cpu_stop_fn_t fn, void *
>>  	cpu_stop_init_done(&done, 2);
>>  	set_state(&msdata, MULTI_STOP_PREPARE);
>> 
>> -	if (cpu1 > cpu2)
>> -		swap(cpu1, cpu2);
>>  	if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
>>  		return -ENOENT;
>> 
> 
> Sebastian

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,
Linux Foundation Collaborative Project

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ