Date:   Sat, 26 Aug 2017 01:12:05 +0900
From:   Byungchul Park <max.byungchul.park@...il.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     Borislav Petkov <bp@...en8.de>,
        Byungchul Park <byungchul.park@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        lkml <linux-kernel@...r.kernel.org>, kernel-team@....com
Subject: Re: WARNING: possible circular locking dependency detected

On Fri, Aug 25, 2017 at 11:47 PM, Sebastian Andrzej Siewior
<bigeasy@...utronix.de> wrote:
> On 2017-08-25 12:03:04 [+0200], Borislav Petkov wrote:
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 4.13.0-rc6+ #1 Not tainted
>> ------------------------------------------------------
>
> While looking at this, I stumbled upon another one also enabled by
> "completion annotation" in the TIP:
>
> | ======================================================
> | WARNING: possible circular locking dependency detected
> | 4.13.0-rc6-00758-gd80d4177391f-dirty #112 Not tainted
> | ------------------------------------------------------
> | cpu-off.sh/426 is trying to acquire lock:
> |  ((complete)&st->done){+.+.}, at: [<ffffffff810cb344>] takedown_cpu+0x84/0xf0
> |
> | but task is already holding lock:
> |  (sparse_irq_lock){+.+.}, at: [<ffffffff811220f2>] irq_lock_sparse+0x12/0x20
> |
> | which lock already depends on the new lock.
> |
> | the existing dependency chain (in reverse order) is:
> |
> | -> #1 (sparse_irq_lock){+.+.}:
> |        __mutex_lock+0x88/0x9a0
> |        mutex_lock_nested+0x16/0x20
> |        irq_lock_sparse+0x12/0x20
> |        irq_affinity_online_cpu+0x13/0xd0
> |        cpuhp_invoke_callback+0x4a/0x130
> |
> | -> #0 ((complete)&st->done){+.+.}:
> |        check_prev_add+0x351/0x700
> |        __lock_acquire+0x114a/0x1220
> |        lock_acquire+0x47/0x70
> |        wait_for_completion+0x5c/0x180
> |        takedown_cpu+0x84/0xf0
> |        cpuhp_invoke_callback+0x4a/0x130
> |        cpuhp_down_callbacks+0x3d/0x80
> …
> |
> | other info that might help us debug this:
> |
> |  Possible unsafe locking scenario:
> |        CPU0                    CPU1
> |        ----                    ----
> |   lock(sparse_irq_lock);
> |                                lock((complete)&st->done);
> |                                lock(sparse_irq_lock);
> |   lock((complete)&st->done);
> |
> |  *** DEADLOCK ***
>
> We hold the sparse_irq_lock lock while waiting for the completion in the
> CPU-down case and in the CPU-up case we acquire the sparse_irq_lock lock
> while the other CPU is waiting for the completion.
> This is not an issue if my interpretation of lockdep here is correct.

Hello Sebastian,

I think you parsed the message correctly.

The message is saying that, for example:

context A (maybe the CPU being brought up?)
--
lock(sparse_irq_lock) // blocks until B releases sparse_irq_lock
complete(&st->done) // never reached

context B (maybe wanting to synchronize with the CPU being brought up?)
--
lock(sparse_irq_lock) // acquired successfully
wait_for_completion(&st->done) // waits for A to complete st->done
unlock(sparse_irq_lock) // never reached
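
To make the reasoning concrete, here is a minimal userspace sketch of the
kind of check that produces such a report. It is not the kernel's lockdep
code; the names (SPARSE_IRQ_LOCK, ST_DONE, add_dep, reachable, reset_deps)
are made up for illustration. It assumes that every "X was held while
waiting for Y" is recorded as an edge X -> Y, and that a new edge is
suspicious when Y already reaches X.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, for this illustration only. */
enum { SPARSE_IRQ_LOCK, ST_DONE, NR_CLASSES };

/* dep[a][b] is true when b was acquired (or waited on) while a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Forget all recorded dependencies. */
static void reset_deps(void)
{
    memset(dep, 0, sizeof(dep));
}

/* Depth-first search: is "to" reachable from "from" via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Record the dependency held -> wanted.
 * Returns false, without recording it, if the edge would close a cycle.
 */
static bool add_dep(int held, int wanted)
{
    bool seen[NR_CLASSES] = { false };

    assert(held >= 0 && held < NR_CLASSES);
    assert(wanted >= 0 && wanted < NR_CLASSES);

    if (reachable(wanted, held, seen))
        return false;   /* possible circular dependency */
    dep[held][wanted] = true;
    return true;
}
```

With this sketch, context A contributes add_dep(ST_DONE, SPARSE_IRQ_LOCK),
since the completion cannot happen until the lock is taken, and that call
succeeds. Context B then contributes add_dep(SPARSE_IRQ_LOCK, ST_DONE),
which returns false because the two edges form a cycle. Like the report,
it only says the ordering is possible, not that both contexts can really
run at the same time.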

I cannot check the kernel code at the moment. I wonder whether this scenario
is actually impossible. Could you confirm?

-- 
Thanks,
Byungchul
