Message-ID: <467303C9.9030706@redhat.com>
Date: Fri, 15 Jun 2007 17:25:29 -0400
From: Chuck Ebbert <cebbert@...hat.com>
To: Miklos Szeredi <miklos@...redi.hu>
CC: mingo@...e.hu, chris@...ee.ca, linux-kernel@...r.kernel.org,
tglx@...utronix.de
Subject: Re: [BUG] long freezes on thinkpad t60
On 06/14/2007 12:04 PM, Miklos Szeredi wrote:
> I've got some more info about this bug. It is gathered with
> nmi_watchdog=2 and a modified nmi_watchdog_tick(), which instead of
> calling die_nmi() just prints a line and calls show_registers().
>
> This makes the machine actually survive the NMI tracing. The attached
> traces are gathered over about an hour of stressing. An mp3 player is
> also going on continually, and I can hear a couple of seconds of
> "looping" quite often, but it gets as far as the NMI trace only
> rarely. AFAICS only the last pair shows a trace for both CPUs during
> the same "freeze".
>
> I've put some effort into understanding what's going on, but I'm not
> familiar with how interrupts work and that sort of thing.
>
> The pattern that emerges is that on CPU0 we have an interrupt, which
> is trying to acquire the rq lock, but can't.
>
> On CPU1 we have strace which is doing wait_task_inactive(), which sort
> of spins acquiring and releasing the rq lock. I've checked some of
> the traces and it is just before acquiring the rq lock, or just after
> releasing it, but is not actually holding it.
>
> So is it possible that wait_task_inactive() could be starving the
> other waiters of the rq spinlock? Any ideas?
Spinlocks aren't fair, so this kind of problem is always a possibility.
I think maybe we need another kind of unlock that gives another processor
a fair chance at the lock. Some things you could try to see if they help:
- add smp_mb() after the unlock
- replace cpu_relax() with usleep()
- use an xchg instruction to do the unlock, like i386 does when
CONFIG_X86_OOSTORE is set
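
For illustration only, here is one way a lock can give every waiter a fair
chance: a ticket lock, sketched in userspace C with C11 atomics and pthreads.
This is not the kernel's spinlock code and not one of the three suggestions
above. All names (ticket_lock, ticket_lock_acquire, ticket_lock_release,
run_demo) are hypothetical, and sched_yield() is only a stand-in for
cpu_relax(). The point is that waiters are served in arrival order, so a
thread that releases and immediately re-acquires cannot overtake one that is
already waiting.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NTHREADS 4
#define ITERS    10000

/* Hypothetical ticket lock: waiters are served strictly in arrival order. */
struct ticket_lock {
    atomic_uint next;   /* next ticket number to hand out */
    atomic_uint owner;  /* ticket number currently allowed to hold the lock */
};

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket; this fixes our position in the queue. */
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);

    /* Wait until it is our turn. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        sched_yield();  /* userspace stand-in for cpu_relax() */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the holder writes 'owner', so a plain load + store is enough. */
    unsigned int cur = atomic_load_explicit(&l->owner, memory_order_relaxed);

    /* Hand the lock to the next ticket in line. */
    atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}

static struct ticket_lock demo_lock;
static long demo_counter;  /* protected by demo_lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        ticket_lock_acquire(&demo_lock);
        demo_counter++;
        ticket_lock_release(&demo_lock);
    }
    return NULL;
}

/* Runs NTHREADS workers and returns the final counter value. */
long run_demo(void)
{
    pthread_t tid[NTHREADS];

    atomic_init(&demo_lock.next, 0);
    atomic_init(&demo_lock.owner, 0);
    demo_counter = 0;

    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);

    return demo_counter;
}
```

With a plain test-and-set lock, the CPU that just released the lock often
wins the next acquisition because it still owns the cache line. Here the
releasing thread gets a new, later ticket, so any thread already waiting goes
first.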