Message-Id: <20130130160827.cadb3262.akpm@linux-foundation.org>
Date: Wed, 30 Jan 2013 16:08:27 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Jan Kara <jack@...e.cz>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
LKML <linux-kernel@...r.kernel.org>, jslaby@...e.cz
Subject: Re: [PATCH] printk: Avoid softlockups in console_unlock()
On Tue, 29 Jan 2013 15:54:24 +0100
Jan Kara <jack@...e.cz> wrote:
> > So I was testing the attached patch which does what we discussed. The bad
> > news is I was able to trigger a situation (twice) when suddenly sda
> > disappeared and thus all IO requests failed with EIO. There is no trace of
> > what's happened in the kernel log. I'm guessing that disabled interrupts on
> > the printing CPU caused scsi layer to time out for some request and fail the
> > device. So where do we go from here?
> Andrew? I guess this fell off your radar via the "hrm, strange, need to
> have a closer look later" path?
urgh. I was hoping that if we left it long enough, one or both of us
would die :(
I fear we will rue the day when we changed printk() to bounce some of
its work up to a kernel thread.
> Currently I'd be inclined to return to my original solution...
Can we make it smarter? Say, take a peek at the current
softlockup/nmi-watchdog intervals, work out for how long we can
afford to keep interrupts disabled and then use that period and
sched_clock() to work out if we're getting into trouble? IOW, remove
the hard-wired "1000" thing which will always be too high or too low
for all situations.
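
A minimal user-space sketch of the time-budget idea, not kernel code.
It stops a flush loop once a nanosecond budget has elapsed rather than
after a fixed record count. Everything named here is hypothetical:
flush_with_budget(), the clock callback (standing in for sched_clock())
and the demo helpers fake_clock() and emit_noop().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock callback: returns a monotonic time in nanoseconds. */
typedef uint64_t (*clock_ns_fn)(void);

/* Demo stand-in for sched_clock(): advances 100ns on every call. */
uint64_t fake_now_ns;

uint64_t fake_clock(void)
{
	fake_now_ns += 100;
	return fake_now_ns;
}

/* Demo stand-in for writing one record to the console. */
void emit_noop(size_t idx)
{
	(void)idx;
}

/*
 * Emit up to nr_records records, stopping once budget_ns has elapsed.
 * At least one record is always emitted so the caller makes progress.
 * Returns the number of records emitted; the rest must be deferred.
 */
size_t flush_with_budget(size_t nr_records, uint64_t budget_ns,
			 clock_ns_fn now, void (*emit)(size_t idx))
{
	uint64_t start = now();
	size_t done = 0;

	while (done < nr_records) {
		emit(done);
		done++;
		if (now() - start >= budget_ns)
			break;
	}
	return done;
}
```

The budget is passed in by the caller, so the loop itself carries no
hard-wired constant.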
Implementation-wise, that would probably end up adding a kernel-wide
function along the lines of
/*
 * Return the maximum number of nanoseconds for which interrupts may be disabled
 * on the current CPU
 */
u64 max_interrupt_disabled_duration(void)
{
	return min(softlockup duration, nmi watchdog duration);
}
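
As a compilable illustration of the same shape, here is a user-space
sketch. It assumes the two watchdog thresholds are available in seconds
from some configuration structure. struct watchdog_limits, its fields
and max_irq_off_ns() are hypothetical names for this example, not
existing kernel interfaces.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical container for the two watchdog thresholds, in seconds. */
struct watchdog_limits {
	unsigned int softlockup_thresh_secs;
	unsigned int hardlockup_thresh_secs;
};

/*
 * Return the smaller of the two thresholds, converted to nanoseconds:
 * the longest time interrupts could stay disabled before either
 * watchdog would fire.
 */
uint64_t max_irq_off_ns(const struct watchdog_limits *w)
{
	uint64_t soft = (uint64_t)w->softlockup_thresh_secs * NSEC_PER_SEC;
	uint64_t hard = (uint64_t)w->hardlockup_thresh_secs * NSEC_PER_SEC;

	return soft < hard ? soft : hard;
}
```

A real caller would probably want a safety margin below this value
rather than running right up to the watchdog limit.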
Thinking ahead...
Other kernel sites which know they can disable interrupts for a long
time can perhaps use this.
Later, realtimeish systems (for example machine controllers) might want
to add a kernel tunable so they can set the
max_interrupt_disabled_duration() return value much lower.
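
One way such a tunable might be layered on top, sketched in user-space
C. The function name and the convention that 0 means "tunable not set"
are assumptions made for this example only.

```c
#include <stdint.h>

/*
 * Combine the watchdog-derived limit with an optional administrator
 * tunable. A tunable of 0 is treated as "not set" (an assumption of
 * this sketch). The tunable may only lower the limit, never raise it
 * past what the watchdogs tolerate.
 */
uint64_t effective_irq_off_budget_ns(uint64_t watchdog_limit_ns,
				     uint64_t tunable_limit_ns)
{
	if (tunable_limit_ns == 0)
		return watchdog_limit_ns;

	return tunable_limit_ns < watchdog_limit_ns ?
		tunable_limit_ns : watchdog_limit_ns;
}
```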
To make that more accurate, we could add per-cpu, per-irq variables to
record sched_clock() when each CPU enters the interrupt, so the comment
becomes
/*
 * Return the remaining maximum number of nanoseconds for which interrupts may
 * be disabled on the current CPU
 */
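
A rough user-space sketch of the "remaining" variant. It is simplified
to a single timestamp per CPU rather than per-cpu, per-irq, and takes
the current time and the maximum as parameters. The array, the CPU
count and both function names are hypothetical.

```c
#include <stdint.h>

#define NR_CPUS_DEMO 4	/* arbitrary size for this sketch */

/* Time at which each CPU last disabled interrupts, in ns. */
uint64_t irq_off_since_ns[NR_CPUS_DEMO];

/* Record the time at which @cpu disabled interrupts. */
void note_irq_off(unsigned int cpu, uint64_t now_ns)
{
	irq_off_since_ns[cpu] = now_ns;
}

/*
 * Return how many nanoseconds of the max_ns budget are left on @cpu,
 * clamped to 0 once the budget has been used up.
 */
uint64_t remaining_irq_off_ns(unsigned int cpu, uint64_t now_ns,
			      uint64_t max_ns)
{
	uint64_t elapsed = now_ns - irq_off_since_ns[cpu];

	return elapsed >= max_ns ? 0 : max_ns - elapsed;
}
```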
This may all be crazy and hopefully we'll never do it, but the design
should permit such things from day one if practical.