[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <p73y7ccwzlb.fsf@bingen.suse.de>
Date: Sun, 02 Dec 2007 22:34:08 +0100
From: Andi Kleen <andi@...stfloor.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: Andi Kleen <andi@...stfloor.org>,
Arjan van de Ven <arjan@...radead.org>,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks
Ingo Molnar <mingo@...e.hu> writes:
>
> do you realize that more than 120 seconds TASK_UNINTERRUPTIBLE _is_
> something that most humans consider as "buggy" in the overwhelming
> majority of cases, regardless of the reason? Yes, there are and will be
> some exceptions, but not nearly as countless as you try to paint it. A
> quick test in the next -mm will give us a good idea about the ratio of
> false positives.
That would assume error paths get regularly exercised in -mm.
Doubtful. Most likely we'll only hear about it after it's
out in the wild on some bigger release.
The problem I have with your patch is that it will mess up Linux (in
particular block/network file system) error handling even more than it
already is. In error handling cases such "unusual" things happen
frequently unfortunately.
I used to fight with this with the NMI watchdog on on x86-64 -- it
tended to trigger regularly on SCSI error handlers for example
disabling interrupts too long while handling the error. They
eventually got all fixed, but with that change they will likely
all start throwing nasty messages again.
And usually it is not simply broken code neither but really
doing something difficult.
-Andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists