linux-kernel - Re: CPU softlockup due to smp_call_function()

Message-ID: <CA+1xoqeeagx=OwV2XFQYb=ATOGtW0Zpyx6BOcVbu3hg+eiMqSA@mail.gmail.com>
Date:	Thu, 5 Apr 2012 14:32:27 +0200
From:	Sasha Levin <levinsasha928@...il.com>
To:	Avi Kivity <avi@...hat.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Dave Jones <davej@...hat.com>, kvm@...r.kernel.org,
	"linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>
Subject: Re: CPU softlockup due to smp_call_function()

On Thu, Apr 5, 2012 at 2:24 PM, Avi Kivity <avi@...hat.com> wrote:
> On 04/04/2012 11:12 PM, Sasha Levin wrote:
>> Hi all,
>>
>> I've started seeing soft lockups resulting from smp_call_function()
>> calls. I've attached two different backtraces of this happening with
>> different code paths.
>>
>> This is running inside a KVM guest with the trinity fuzzer, using
>> today's linux-next kernel.
>>
>> [ 6540.134009] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u:1:38]
>> [ 6540.134048] irq event stamp: 286811770
>> [ 6540.134048] hardirqs last  enabled at (286811769): [<ffffffff82669e74>] restore_args+0x0/0x30
>> [ 6540.134048] hardirqs last disabled at (286811770): [<ffffffff8266b3ea>] apic_timer_interrupt+0x6a/0x80
>> [ 6540.134048] softirqs last  enabled at (286811768): [<ffffffff810b746e>] __do_softirq+0x16e/0x190
>> [ 6540.134048] softirqs last disabled at (286811749): [<ffffffff8266bdec>] call_softirq+0x1c/0x30
>> [ 6540.134048] CPU 0
>> [ 6540.134048] Pid: 38, comm: kworker/u:1 Tainted: G        W 3.4.0-rc1-next-20120404-sasha-dirty #72
>> [ 6540.134048] RIP: 0010:[<ffffffff8111f30e>]  [<ffffffff8111f30e>] smp_call_function_many+0x27e/0x2a0
>>
>
> This cpu is waiting for some other cpu to process a function (likely
> rps_trigger_softirq(), from the trace).  Can you get a backtrace on all
> cpus when this happens?
>
> It would be good to enhance smp_call_function_*() to do this
> automatically when it happens - it's spinning there anyway, so it might
> as well count the iterations and NMI the lagging cpu if it waits for too
> long.
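
For illustration only, the counting idea quoted above could look roughly
like the following userspace C sketch. It is not kernel code: the names
wait_for_cpu(), fake_nmi_backtrace(), SPIN_LIMIT and the give_up bound are
all hypothetical, and the remote CPU is simulated by an iteration count
after which it acknowledges the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS    4
#define SPIN_LIMIT 1000UL	/* hypothetical threshold, not a kernel constant */

bool pending[NR_CPUS];		/* simulated "call not yet processed" flags */
int nmi_sent[NR_CPUS];		/* how many fake NMIs each CPU received */

/* Hypothetical stand-in for NMI-ing a CPU to make it dump a backtrace. */
void fake_nmi_backtrace(int cpu)
{
	nmi_sent[cpu]++;
	printf("requesting backtrace from lagging cpu %d\n", cpu);
}

/*
 * Spin until the simulated CPU acknowledges, counting iterations.
 * ack_after: iteration at which the simulated CPU finishes the call.
 * give_up:   bound that keeps this sketch from spinning forever.
 * Returns the number of iterations spent waiting.
 */
unsigned long wait_for_cpu(int cpu, unsigned long ack_after,
			   unsigned long give_up)
{
	unsigned long spins = 0;
	bool reported = false;

	assert(cpu >= 0 && cpu < NR_CPUS);

	while (pending[cpu] && spins < give_up) {
		spins++;
		if (spins >= ack_after)
			pending[cpu] = false;	/* simulated remote CPU is done */
		/* Waited too long: poke the lagging CPU once. */
		if (pending[cpu] && spins >= SPIN_LIMIT && !reported) {
			fake_nmi_backtrace(cpu);
			reported = true;
		}
	}
	return spins;
}
```

Setting pending[cpu] and calling wait_for_cpu(cpu, 3000, 5000) requests one
backtrace, because the simulated CPU needs more than SPIN_LIMIT iterations;
a CPU that acknowledges after 10 iterations triggers none.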

What do you think about modifying the softlockup detector to NMI all
CPUs if it's going to panic because it detected a lockup?
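
A rough userspace C sketch of that suggestion, to make the intended
behaviour concrete. It only simulates the detector: touch_watchdog(),
check_softlockup(), fake_backtrace_all_cpus() and LOCKUP_THRESHOLD are
hypothetical names and values, and timestamps are plain integers in
seconds rather than anything the real watchdog uses.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS          4
#define LOCKUP_THRESHOLD 20UL	/* seconds; hypothetical value */

unsigned long last_touch[NR_CPUS];	/* last time each CPU made progress */
int backtraces[NR_CPUS];		/* backtrace requests per CPU */

/* A CPU reports that it is still making progress. */
void touch_watchdog(int cpu, unsigned long now)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	last_touch[cpu] = now;
}

/* Hypothetical stand-in for NMI-ing every CPU to print its backtrace. */
void fake_backtrace_all_cpus(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		backtraces[cpu]++;
		printf("backtrace requested for cpu %d\n", cpu);
	}
}

/*
 * Returns the first CPU stuck for longer than the threshold, or -1.
 * On detection, all CPUs are asked for a backtrace, not just the stuck
 * one, since the stuck CPU may be waiting on another.
 * Callers must pass a 'now' that is not earlier than any recorded touch.
 */
int check_softlockup(unsigned long now)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		unsigned long stuck = now - last_touch[cpu];

		if (stuck > LOCKUP_THRESHOLD) {
			printf("soft lockup - CPU#%d stuck for %lus\n",
			       cpu, stuck);
			fake_backtrace_all_cpus();
			return cpu;
		}
	}
	return -1;
}
```

The point of dumping every CPU is that the backtrace of the CPU spinning
in smp_call_function_many() alone does not show what the other CPU is
doing instead of servicing the call.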