Message-ID: <94D0CD8314A33A4D9D801C0FE68B402958C1A9E5@G9W0745.americas.hpqcorp.net>
Date:	Fri, 22 Aug 2014 00:39:59 +0000
From:	"Elliott, Robert (Server Storage)" <Elliott@...com>
To:	Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@...achi.com>,
	Hannes Reinecke <hare@...e.de>
CC:	"Martin K. Petersen" <martin.petersen@...cle.com>,
	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	"yrl.pp-manager.tt@...achi.com" <yrl.pp-manager.tt@...achi.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"James E.J. Bottomley" <JBottomley@...allels.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Doug Gilbert <dgilbert@...erlog.com>,
	Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
	Christoph Hellwig <hch@....de>
Subject: RE: [RFC PATCH -logging 00/10] scsi/constants: Output continuous
 error messages on trace

> -----Original Message-----
> From: linux-scsi-owner@...r.kernel.org [mailto:linux-scsi-
> owner@...r.kernel.org] On Behalf Of Yoshihiro YUNOMAE
> Sent: Friday, 08 August, 2014 6:50 AM
> Subject: [RFC PATCH -logging 00/10] scsi/constants: Output continuous
> error messages on trace
...
> 1) printk
> Keeps current implementation of upstream kernel.
> The messages are divided and can be mixed, but all users can
> check the error messages without any settings.

scsi_io_completion ignores the scsi_logging_level and always calls
printk if it detects ACTION_FAIL, resulting in messages like:

    [10240.338600] sd 2:0:0:0: [sdr]
    [10240.339722] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [10240.341662] sd 2:0:0:0: [sdr]
    [10240.342792] Sense Key : Hardware Error [current]
    [10240.344575] sd 2:0:0:0: [sdr]
    [10240.345653] Add. Sense: Logical unit failure
    [10240.347138] sd 2:0:0:0: [sdr] CDB:
    [10240.348309] Read(10): 28 00 00 00 00 80 00 00 08 00

If you trigger hundreds of errors (e.g., by hot-removing a device
during heavy I/O), all the prints to the Linux serial console
bog down the system, causing command timeouts on other
devices and soft lockups for applications.

Some changes that would help are:
1. Put them under SCSI logging level control
2. Use printk_ratelimited so that excessive messages are trimmed

Would you like to include something like this in your
patch set?

This is an example patch that only prints them if the MLCOMPLETE 
logging level is nonzero.
Off: scsi_logging_level --set --mlcomplete=0
On: scsi_logging_level --set --mlcomplete=1

Some other loglevel (e.g., ERROR_RECOVERY) could be used.

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d6b4ea8..dbb601f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1037,7 +1037,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 	switch (action) {
 	case ACTION_FAIL:
 		/* Give up and fail the remainder of the request */
-		if (!(req->cmd_flags & REQ_QUIET)) {
+		if (!(req->cmd_flags & REQ_QUIET) &&
+		    SCSI_LOG_LEVEL(SCSI_LOG_MLCOMPLETE_SHIFT,
+		    SCSI_LOG_MLCOMPLETE_BITS)) {
 			scsi_print_result(cmd);
 			if (driver_byte(result) & DRIVER_SENSE)
 				scsi_print_sense("", cmd);
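
To illustrate what the added condition tests, here is a rough user-space
model of the check. It is only a sketch: the shift and width values (12
and 3) are assumptions that should be verified against
drivers/scsi/scsi_logging.h, the global is a plain variable rather than
the kernel's sysctl-backed one, and set_mlcomplete() and
should_print_failure() are hypothetical helpers that do not exist in the
kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel global that scsi_logging_level --set writes. */
static unsigned int scsi_logging_level;

/* Assumed layout: one 3-bit field per facility, MLCOMPLETE at bit 12. */
#define SCSI_LOG_MLCOMPLETE_SHIFT 12
#define SCSI_LOG_MLCOMPLETE_BITS  3

/* Extract one facility's level from the packed logging word. */
#define SCSI_LOG_LEVEL(SHIFT, BITS) \
	((scsi_logging_level >> (SHIFT)) & ((1u << (BITS)) - 1))

/* Hypothetical helper: models "scsi_logging_level --set --mlcomplete=N". */
static void set_mlcomplete(unsigned int level)
{
	unsigned int mask = (1u << SCSI_LOG_MLCOMPLETE_BITS) - 1;

	assert(level <= mask);
	scsi_logging_level &= ~(mask << SCSI_LOG_MLCOMPLETE_SHIFT);
	scsi_logging_level |= level << SCSI_LOG_MLCOMPLETE_SHIFT;
}

/* Hypothetical helper: the condition from the patch, with the
 * REQ_QUIET flag test reduced to a boolean. */
static bool should_print_failure(bool req_quiet)
{
	return !req_quiet &&
	       SCSI_LOG_LEVEL(SCSI_LOG_MLCOMPLETE_SHIFT,
			      SCSI_LOG_MLCOMPLETE_BITS) != 0;
}
```

With the level at 0 the failure report is skipped entirely; any nonzero
level restores today's behavior, and REQ_QUIET still suppresses the
report in both cases.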

Converting to printk_ratelimited is harder since the prints
are spread out over three functions (and as your patch
series notes, many individual printk calls).  The rates
for the printk calls might not match, which would lead to
even more confusing output.
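
One way around mismatched rates would be to make a single rate-limit
decision per failed command and let it gate every line of that command's
report. The following is a simplified user-space model of that idea, not
the kernel's ratelimit implementation: struct report_limiter,
report_allowed() and print_failure() are hypothetical names, time is
passed in by the caller, and printf stands in for printk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* One limiter shared by all lines of a failure report. */
struct report_limiter {
	long interval;      /* window length, in caller-defined time units */
	int burst;          /* whole reports allowed per window */
	long window_start;  /* start time of the current window */
	int printed;        /* reports printed in the current window */
	int suppressed;     /* reports dropped since the limiter was set up */
};

/* Decide once per failed command whether its report may be printed. */
static bool report_allowed(struct report_limiter *rl, long now)
{
	assert(rl->interval > 0 && rl->burst > 0);

	if (now - rl->window_start >= rl->interval) {
		/* A new window begins; the burst budget is restored. */
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->suppressed++;
	return false;
}

/* Print all lines of a report or none of them; returns lines printed. */
static int print_failure(struct report_limiter *rl, long now,
			 const char *dev, const char *result,
			 const char *sense_key)
{
	if (!report_allowed(rl, now))
		return 0;

	printf("%s Result: %s\n", dev, result);
	printf("%s Sense Key : %s\n", dev, sense_key);
	return 2;
}
```

Because the decision is taken once, a report is never half-printed,
which per-call-site limiters cannot guarantee. The cost is that the
print helpers would need the decision passed in or made by their caller.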

---
Rob Elliott    HP Server Storage


