Message-ID: <20130304054254.GA21630@localhost>
Date:	Mon, 4 Mar 2013 13:42:54 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Johannes Weiner <hannes@...xchg.org>, linux-kernel@...r.kernel.org
Subject: [printk A-A deadlock] possible reason: unannotated irqs-on

Greetings,

I got the oops below, and the first bad commit is:

commit 0c2e1aacdc87c3ecf16cf26e8b3476c6203d03e2
Author: Jan Kara <jack@...e.cz>
Date:   Sat Mar 2 00:02:38 2013 +0000

    printk: avoid softlockups in console_unlock()
    
    A CPU can be caught in console_unlock() for a long time (tens of seconds
    are reported by our customers) when other CPUs are using printk heavily
    and serial console makes printing slow.  Although serial console drivers
    call touch_nmi_watchdog(), this triggers softlockup warnings because
    interrupts are effectively disabled for the whole time printing takes
    place.  Thus IPIs cannot be processed and other CPUs get stuck spinning in
    calls like smp_call_function_many().  Also RCU eventually starts reporting
    lockups.
    
    In my artificial testing I also managed to trigger a situation when disk
    disappeared from the system apparently because commands to / from it could
    not be delivered for long enough.  This is why just silencing watchdogs
    isn't a reliable solution to the problem.
    
    One part of fixing the issue is changing vprintk_emit() to call
    console_unlock() with interrupts enabled (this isn't perfect as printk()
    itself can be called with interrupts disabled but it improves the
    situation in lots of cases).  Another part is limiting the time we spend
    in console_unlock() printing loop to watchdog_thresh() / 4.  Then we
    release console_sem and wait for watchdog_thresh() / 4 to give a chance to
    other printk() users to get the semaphore and start printing.  If printk()
    was called with interrupts enabled, it also gives CPU a chance to process
    blocked interrupts.  Then we recheck if there's still anything to print,
    try to grab console_sem again and if we succeed, we go on with printing.
    
    Signed-off-by: Jan Kara <jack@...e.cz>
    Cc: "Paul E. McKenney" <paulmck@...ibm.com>
    Cc: Steven Rostedt <rostedt@...dmis.org>
    Cc: Ingo Molnar <mingo@...e.hu>
    Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
    Cc: Frederic Weisbecker <fweisbec@...il.com>
    Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
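
For readers who want to picture the throttling scheme the commit message
describes, here is a rough user-space model of it. This is only a sketch
and not the kernel code: FakeClock, drain_console() and cost_per_msg are
made-up names for illustration, the semaphore is always re-acquired
because there is only one "CPU", and interrupt enabling/disabling (the
part lockdep complains about below) is not modelled at all.

```python
from collections import deque


class FakeClock:
    """Deterministic stand-in for wall-clock time (hypothetical helper)."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


def drain_console(log, clock, watchdog_thresh=10.0, cost_per_msg=1.0):
    """Print queued messages in bursts of roughly watchdog_thresh / 4 seconds.

    Models only the time budgeting described in the commit message:
    print until the budget is used up, release the semaphore, back off
    for the same amount of time, then recheck and continue.
    """
    budget = watchdog_thresh / 4
    bursts = []
    while log:
        # "Take console_sem" and start a printing burst.
        start = clock.now
        burst = []
        while log and clock.now - start < budget:
            burst.append(log.popleft())
            clock.advance(cost_per_msg)  # a slow serial console
        bursts.append(burst)
        # "Release console_sem"; if work remains, wait before retrying so
        # that other printk() users could get the semaphore.
        if log:
            clock.advance(budget)
    return bursts


clock = FakeClock()
pending = deque("msg%d" % i for i in range(7))
bursts = drain_console(pending, clock)
print([len(b) for b in bursts], clock.now)
```

With the assumed 10 second threshold and 1 second per message, the seven
queued messages go out in three bursts separated by two back-off periods.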

[    0.000000]                      A-A deadlock:possible reason: unannotated irqs-on.
[    0.000000] irq event stamp: 378
[    0.000000] hardirqs last  enabled at (377): [<ffffffff810bb15c>] debug_check_no_locks_freed+0x1f8/0x224
[    0.000000] hardirqs last disabled at (378): [<ffffffff8104ae9c>] vprintk_emit+0xb3/0x88f
[    0.000000] softirqs last  enabled at (0): [<          (null)>]           (null)
[    0.000000] softirqs last disabled at (0): [<          (null)>]           (null)

git bisect start 1eea8bb261e8d43492c91d516d97cf96cdac7a3b 19f949f52599ba7c3f67a5897ac6be14bfcb1200 --
git bisect good 09bca78e81d0ba702b601e09fd887e20c32fc7af  #  1000  2013-03-02 15:21:54  block: restore /proc/partitions to not display non-partitionable removable devices
git bisect good b4a2adc2b273b37274b5a74fc0995927f44fe773  #  1001  2013-03-02 18:55:30  mm/dmapool.c: fix null dev in dma_pool_create()
git bisect  bad 0c2e1aacdc87c3ecf16cf26e8b3476c6203d03e2  #     0  2013-03-02 18:59:08  printk: avoid softlockups in console_unlock()
git bisect good 641feee3e6eeb680efcf5c6f625bb2cb4fcfc52c  #  1000  2013-03-02 22:33:52  early_printk: consolidate random copies of identical code
git bisect good 8a7d64347f4144f2e93745ba971809ca843887ae  #  1002  2013-03-03 02:13:19  include/linux/fs.h: disable preempt when acquire i_size_seqcount write lock
git bisect good 6e1078c01b2e145140e7e8ce341c427b8be1eec3  #  1000  2013-03-03 05:52:33  kernel/smp.c: cleanups
git bisect good 6e1078c01b2e145140e7e8ce341c427b8be1eec3  #  3005  2013-03-03 16:30:52  kernel/smp.c: cleanups
git bisect  bad 67e51ff72caec0de0c3673303c0e7ee935ef80d8  #     0  2013-03-03 16:34:47  add a refcount check in dput()
git bisect good 52efcaa3b551117572315f73c5e787f65c734493  #  3002  2013-03-04 02:57:11  Revert "printk: avoid softlockups in console_unlock()"
git bisect good 106edea2fe051df65a1a6231e9ffa2876cc391cc  #  3001  2013-03-04 13:34:26  Add linux-next specific files for 20130301

Thanks,
Fengguang

View attachment "dmesg-kvm-waimea-6943-2013-03-02-09-24-46-3.8.0-mm1-00066-g1eea8bb-252" of type "text/plain" (70450 bytes)

View attachment "1eea8bb261e8d43492c91d516d97cf96cdac7a3b-bisect.log" of type "text/plain" (30510 bytes)

View attachment ".config-bisect" of type "text/plain" (50672 bytes)
