Message-ID: <20130712024412.GA23785@thunk.org>
Date: Thu, 11 Jul 2013 22:44:12 -0400
From: Theodore Ts'o <tytso@....edu>
To: Anatol Pomazau <anatol@...gle.com>
Cc: linux-ext4@...r.kernel.org,
Anatol Pomozov <anatol.pomozov@...il.com>
Subject: Re: ext4: Rate limit printk in buffer_io_error()
On Tue, Jul 09, 2013 at 04:01:38PM -0700, Anatol Pomazau wrote:
> From: Anatol Pomozov <anatol.pomozov@...il.com>
>
> If there are a lot of outstanding buffered IOs when a device is
> taken offline (due to hardware errors etc), ext4_end_bio prints
> out a message for each failed logical block. While this is desirable,
> we see thousands of such lines being printed out before the
> serial console gets overwhelmed, causing ext4_end_bio() to wait for
> the printk to complete.
>
> This in itself isn't a disaster, except for the detail that this
> function is being called with the queue lock held.
> This causes any other function in the block layer
> to spin on its spin_lock_irqsave while the serial console is
> draining. If the NMI watchdog is enabled on this machine, it
> eventually comes along and shoots the machine in the head.
>
> The end result is that losing any one disk causes the machine to
> go down. This patch rate limits the printk to bandaid around the
> problem.
>
> Tested: xfstests
> Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8
> Signed-off-by: Anatol Pomozov <anatol.pomozov@...il.com>
Thanks, applied.
- Ted
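
For readers following the thread, here is a minimal userspace sketch of the
rate-limiting idea described in the patch description: allow a small burst of
messages per time window and count the rest as suppressed. This is not the
applied patch. The names struct ratelimit, ratelimit_ok() and
report_failed_blocks() are hypothetical, time is passed in as an abstract tick
count so the behaviour is deterministic, and the message text is illustrative.
In the kernel itself, helpers such as printk_ratelimited() serve this purpose.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a rate-limit state (assumption). */
struct ratelimit {
    long interval;  /* window length, in abstract ticks */
    int burst;      /* messages allowed per window */
    long begin;     /* tick at which the current window started */
    int printed;    /* messages emitted in the current window */
    int missed;     /* messages suppressed in the current window */
    bool started;   /* false until the first call opens a window */
};

/* Return true if the caller may print a message at tick `now`. */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
    /* Open a fresh window on first use or once the interval has elapsed. */
    if (!rl->started || now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
        rl->missed = 0;
        rl->started = true;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}

/*
 * Simulate `nblocks` failed logical blocks completing one per tick,
 * starting at tick `start`. Returns the number of lines actually printed.
 */
static int report_failed_blocks(struct ratelimit *rl, long start, int nblocks)
{
    int lines = 0;

    for (int i = 0; i < nblocks; i++) {
        if (ratelimit_ok(rl, start + i)) {
            printf("I/O error on logical block %d\n", i);
            lines++;
        }
    }
    return lines;
}
```

With interval = 5000 and burst = 10, a flood of thousands of failed blocks
inside one window produces only ten lines of output, so the time spent
draining the console while the lock is held stays bounded.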
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html