Date:   Thu, 8 Mar 2018 16:50:06 +0800
From:   Wen Yang <wen.yang99@....com.cn>
To:     jejb@...ux.vnet.ibm.com, martin.petersen@...cle.com
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        jiang.biao2@....com.cn, zhong.weidong@....com.cn,
        wen.yang99@....com.cn
Subject: [PATCH] scsi: Replace sdev_printk with printk_deferred to avoid

When SCSI disks fail frequently and a serial console is attached, tasks
may be blocked in the following flow for more than 10s:
[  557.369580]  <<EOE>>  [<ffffffff81309336>] blkcg_print_blkgs+0x76/0xf0   <-- waiting for blkg->q->queue_lock
[  557.369581]  [<ffffffff8130f236>] cfqg_print_rwstat_recursive+0x36/0x40
[  557.369583]  [<ffffffff81109393>] cgroup_seqfile_show+0x73/0x80
[  557.369584]  [<ffffffff81222b57>] ? seq_buf_alloc+0x17/0x40
[  557.369585]  [<ffffffff8122305a>] seq_read+0x10a/0x3b0
[  557.369586]  [<ffffffff811fe9be>] vfs_read+0x9e/0x170
[  557.369587]  [<ffffffff811ff58f>] SyS_read+0x7f/0xe0
[  557.369588]  [<ffffffff81697ac9>] system_call_fastpath+0x16/0x1b
That is because the serial console is very slow, and another task holds
q->queue_lock for a very long time while waiting for the serial write
to finish:
PID: 319    TASK: ffff881ffb09edd0  CPU: 7   COMMAND: "kworker/u113:1"
...
 #4 [ffff881ffb0b7540] delay_tsc at ffffffff81326724
 #5 [ffff881ffb0b7548] __const_udelay at ffffffff81326678
 #6 [ffff881ffb0b7558] wait_for_xmitr at ffffffff814056e0
 #7 [ffff881ffb0b7580] serial8250_console_putchar at ffffffff814058ac
 #8 [ffff881ffb0b75a0] uart_console_write at ffffffff8140035a
 #9 [ffff881ffb0b75d0] serial8250_console_write at ffffffff814057fe
 #10 [ffff881ffb0b7618] call_console_drivers.constprop.17 at ffffffff81087011
 #11 [ffff881ffb0b7640] console_unlock at ffffffff810889e9
 #12 [ffff881ffb0b7680] vprintk_emit at ffffffff81088df4
 #13 [ffff881ffb0b76f0] dev_vprintk_emit at ffffffff81428e72
 #14 [ffff881ffb0b77a8] dev_printk_emit at ffffffff81428eee
 #15 [ffff881ffb0b7808] __dev_printk at ffffffff8142937e
 #16 [ffff881ffb0b7818] dev_printk at ffffffff8142942d
 #17 [ffff881ffb0b7888] sdev_prefix_printk at ffffffff81463771
 #18 [ffff881ffb0b7918] scsi_prep_state_check at ffffffff814598e4
 #19 [ffff881ffb0b7928] scsi_prep_fn at ffffffff8145992d
 #20 [ffff881ffb0b7960] blk_peek_request at ffffffff812f0826
 #21 [ffff881ffb0b7988] scsi_request_fn at ffffffff8145b588
 #22 [ffff881ffb0b79f0] __blk_run_queue at ffffffff812ebd63
 #23 [ffff881ffb0b7a08] blk_queue_bio at ffffffff812f1013        <-- acquired q->queue_lock, now waiting for console_write to finish
 #24 [ffff881ffb0b7a50] generic_make_request at ffffffff812ef209
 #25 [ffff881ffb0b7a98] submit_bio at ffffffff812ef351
 #26 [ffff881ffb0b7af0] xfs_submit_ioend_bio at ffffffffa0146a63 [xfs]
 #27 [ffff881ffb0b7b00] xfs_submit_ioend at ffffffffa0146b31 [xfs]
 #28 [ffff881ffb0b7b40] xfs_vm_writepages at ffffffffa0146e18 [xfs]
 #29 [ffff881ffb0b7bb8] do_writepages at ffffffff8118da6e

Commit e480af09c49736848f749a43dff2c902104f6691 avoided triggering the
watchdog, but it does not prevent tasks from being blocked for a long
time.

This patch replaces sdev_printk with the asynchronous printk_deferred,
which avoids blocking tasks on the slow serial console when disks are
unstable in such a scenario.

Signed-off-by: Wen Yang <wen.yang99@....com.cn>
Signed-off-by: Jiang Biao <jiang.biao2@....com.cn>
---
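For illustration only, here is a minimal userspace sketch of the idea
behind the change: format the message into a buffer while the lock is
held, and push it to the slow console only after the lock is dropped.
It is not kernel code and does not reproduce how printk_deferred is
implemented. The ring buffer, the queue_lock_held flag, and the helper
names log_direct, log_deferred, flush_deferred and reject_io are all
hypothetical, as are the "sd" and "0:0:0:0" strings that stand in for
dev_driver_string() and dev_name().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define LOG_SLOTS 8	/* hypothetical ring size */
#define LOG_LEN   128	/* hypothetical maximum message length */

static char log_ring[LOG_SLOTS][LOG_LEN];
static unsigned int log_head, log_tail;

static int queue_lock_held;		/* stand-in for q->queue_lock */
static int console_writes;		/* total writes to the "console" */
static int slow_writes_under_lock;	/* writes made while the lock was held */

/* Stand-in for the slow serial console write path. */
static void slow_console_write(const char *msg)
{
	if (queue_lock_held)
		slow_writes_under_lock++;
	console_writes++;
	fputs(msg, stdout);
}

/* Synchronous logging: the caller pays for the console write at once. */
static void log_direct(const char *fmt, ...)
{
	char buf[LOG_LEN];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	slow_console_write(buf);
}

/* Deferred logging: only format and queue; no console access here. */
static void log_deferred(const char *fmt, ...)
{
	va_list ap;

	if (log_head - log_tail == LOG_SLOTS)
		return;		/* ring full: this sketch drops the message */
	va_start(ap, fmt);
	vsnprintf(log_ring[log_head % LOG_SLOTS], LOG_LEN, fmt, ap);
	va_end(ap);
	log_head++;
}

/* Drain queued messages; must only run once the lock has been dropped. */
static void flush_deferred(void)
{
	assert(!queue_lock_held);
	while (log_tail != log_head) {
		slow_console_write(log_ring[log_tail % LOG_SLOTS]);
		log_tail++;
	}
}

/* Model of the "rejecting I/O" path, which runs with the lock held. */
static void reject_io(int deferred)
{
	queue_lock_held = 1;
	if (deferred)
		log_deferred("%s %s: rejecting I/O to offline device\n",
			     "sd", "0:0:0:0");
	else
		log_direct("%s %s: rejecting I/O to offline device\n",
			   "sd", "0:0:0:0");
	queue_lock_held = 0;
	flush_deferred();
}
```

Calling reject_io(0) performs the slow write inside the locked region,
while reject_io(1) only queues the message there and emits it after the
lock is released, so other tasks waiting for the lock are not held up
by the console.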
 drivers/scsi/scsi_lib.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a86df9c..2bbe34c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1297,8 +1297,10 @@ static int scsi_setup_cmnd(struct scsi_device *sdev, struct request *req)
 			 * commands.  The device must be brought online
 			 * before trying any recovery commands.
 			 */
-			sdev_printk(KERN_ERR, sdev,
-				    "rejecting I/O to offline device\n");
+			printk_deferred(KERN_ERR
+					"%s %s: rejecting I/O to offline device\n",
+					dev_driver_string(&sdev->sdev_gendev),
+					dev_name(&sdev->sdev_gendev));
 			ret = BLKPREP_KILL;
 			break;
 		case SDEV_DEL:
@@ -1306,8 +1308,10 @@ static int scsi_setup_cmnd(struct scsi_device *sdev, struct request *req)
 			 * If the device is fully deleted, we refuse to
 			 * process any commands as well.
 			 */
-			sdev_printk(KERN_ERR, sdev,
-				    "rejecting I/O to dead device\n");
+			printk_deferred(KERN_ERR
+					"%s %s: rejecting I/O to dead device\n",
+					dev_driver_string(&sdev->sdev_gendev),
+					dev_name(&sdev->sdev_gendev));
 			ret = BLKPREP_KILL;
 			break;
 		case SDEV_BLOCK:
@@ -1798,8 +1802,10 @@ static void scsi_request_fn(struct request_queue *q)
 			break;
 
 		if (unlikely(!scsi_device_online(sdev))) {
-			sdev_printk(KERN_ERR, sdev,
-				    "rejecting I/O to offline device\n");
+			printk_deferred(KERN_ERR
+					"%s %s: rejecting I/O to offline device\n",
+					dev_driver_string(&sdev->sdev_gendev),
+					dev_name(&sdev->sdev_gendev));
 			scsi_kill_request(req, q);
 			continue;
 		}
-- 
1.8.3.1
