[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <517EC603.6010506@acm.org>
Date: Mon, 29 Apr 2013 21:12:03 +0200
From: Bart Van Assche <bvanassche@....org>
To: Neil Horman <nhorman@...driver.com>
CC: David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
mingo@...hat.com
Subject: Re: [PATCH RFC] spinlock: split out debugging check from spin_lock_mutex
On 04/29/13 20:13, Neil Horman wrote:
> On Sat, Apr 27, 2013 at 08:53:54PM +0200, Bart Van Assche wrote:
>> Sorry Neil, but I can still trigger the CPU stuck messages with kernel v3.9-rc8-24-gd7d7271:
>>
>> kernel: BUG: soft lockup - CPU#0 stuck for 22s! [rs:main Q:Reg:601]
>> kernel: irq event stamp: 1999192
>> kernel: hardirqs last enabled at (1999191): [<ffffffff8103e89d>] console_unlock+0x41d/0x450
>> kernel: hardirqs last disabled at (1999192): [<ffffffff8143e96a>] apic_timer_interrupt+0x6a/0x80
>> kernel: softirqs last enabled at (1999188): [<ffffffff81044e26>] __do_softirq+0x196/0x280
>> kernel: softirqs last disabled at (1999181): [<ffffffff810450c5>] irq_exit+0xb5/0xc0
>> kernel: CPU 0
>> kernel: Pid: 601, comm: rs:main Q:Reg Tainted: G O 3.9.0-rc8-debug+ #1 System manufacturer P5Q DELUXE/P5Q DELUXE
>> kernel: RIP: 0010:[<ffffffff8103e8a0>] [<ffffffff8103e8a0>] console_unlock+0x420/0x450
>> kernel: Call Trace:
>> kernel: [<ffffffff812bbf37>] do_con_write.part.19+0x887/0x2040
>> kernel: [<ffffffff812a52b7>] ? process_output+0x37/0x70
>> kernel: [<ffffffff814316fc>] ? mutex_lock_nested+0x28c/0x350
>> kernel: [<ffffffff812a52b7>] ? process_output+0x37/0x70
>> kernel: [<ffffffff812bd764>] con_write+0x34/0x50
>> kernel: [<ffffffff812a51e9>] do_output_char+0x179/0x210
>> kernel: [<ffffffff812a52cd>] process_output+0x4d/0x70
>> kernel: [<ffffffff812a59d0>] n_tty_write+0x210/0x480
>> kernel: [<ffffffff81072710>] ? try_to_wake_up+0x2e0/0x2e0
>> kernel: [<ffffffff812a2839>] tty_write+0x159/0x300
>> kernel: [<ffffffff8109326f>] ? lock_release_holdtime.part.22+0xf/0x180
>> kernel: [<ffffffff812a57c0>] ? n_tty_poll+0x1c0/0x1c0
>> kernel: [<ffffffff81151a3b>] vfs_write+0xab/0x170
>> kernel: [<ffffffff81151ea5>] sys_write+0x55/0xa0
>> kernel: [<ffffffff8143dd82>] system_call_fastpath+0x16/0x1b
>
> So, looking at this, I'm not 100% sure whats going on. console_unlock has the
> ability to detect the need for rescheduling when we call
> raw_spin_unlock_irqrestore at the bottom of the function, so there should be no
> reason that we don't give the per-cpu watchdog task a chance to run. About the
> only case I see in which we might be in trouble would be if somehow got into an
> infinite loop if every message in the log buffer had LOG_NOCONS set, in which
> case we would spin in this loop without ever scheduling or re-enabling
> interrupts. But for that to happen I think you would have to be flooding the
> log with messages that were bigger than the default message buffer size (about
> 1000 bytes without a newline). You're not doing that are you?
The above call trace was triggered by sending messages at a high rate
over netconsole where each message has a length between 40 and (about)
160 characters. I'm not sure though what exactly is causing this call
trace - I haven't had the time yet to analyze this in detail.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists