Message-ID: <4C9D13B6.5080404@linux.vnet.ibm.com>
Date:	Fri, 24 Sep 2010 16:10:14 -0500
From:	Brian King <brking@...ux.vnet.ibm.com>
To:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
CC:	linux-scsi <linux-scsi@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Vasu Dev <vasu.dev@...ux.intel.com>,
	Tim Chen <tim.c.chen@...ux.intel.com>,
	Andi Kleen <ak@...ux.intel.com>,
	Matthew Wilcox <willy@...ux.intel.com>,
	James Bottomley <James.Bottomley@...e.de>,
	Mike Christie <michaelc@...wisc.edu>,
	James Smart <james.smart@...lex.com>,
	Andrew Vasquez <andrew.vasquez@...gic.com>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	Hannes Reinecke <hare@...e.de>,
	Joe Eykholt <jeykholt@...co.com>,
	Christoph Hellwig <hch@....de>,
	MPTFusionLinux <DL-MPTFusionLinux@....com>,
	"eata.c maintainer" <dario.ballabio@...ind.it>
Subject: Re: [RFC v3 01/15] scsi: Drop struct Scsi_Host->host_lock usage in
 scsi_dispatch_cmd()

On 09/24/2010 03:44 PM, Nicholas A. Bellinger wrote:
> On Fri, 2010-09-24 at 08:41 -0500, Brian King wrote:
>> On 09/23/2010 06:37 PM, Nicholas A. Bellinger wrote:
>>> @@ -651,7 +655,6 @@ static inline void scsi_cmd_get_serial(struct Scsi_Host *host, struct scsi_cmnd
>>>  int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
>>>  {
>>>  	struct Scsi_Host *host = cmd->device->host;
>>> -	unsigned long flags = 0;
>>>  	unsigned long timeout;
>>>  	int rtn = 0;
>>>
>>> @@ -736,15 +739,11 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
>>>  		scsi_done(cmd);
>>>  		goto out;
>>>  	}
>>> -
>>> -	spin_lock_irqsave(host->host_lock, flags);
>>>  	/*
>>> -	 * AK: unlikely race here: for some reason the timer could
>>> -	 * expire before the serial number is set up below.
>>> -	 *
>>> -	 * TODO: kill serial or move to blk layer
>>> +	 * Note that scsi_cmd_get_serial() used to be called here, but
>>> +	 * now we expect the legacy SCSI LLDs that actually need this
>>> +	 * to call it directly within their SHT->queuecommand() caller.
>>>  	 */
>>> -	scsi_cmd_get_serial(host, cmd); 
>>>
>>>  	if (unlikely(host->shost_state == SHOST_DEL)) {
>>>  		cmd->result = (DID_NO_CONNECT << 16);
>>> @@ -753,7 +752,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
>>>  		trace_scsi_dispatch_cmd_start(cmd);
>>>  		rtn = host->hostt->queuecommand(cmd, scsi_done);
>>>  	}
>>> -	spin_unlock_irqrestore(host->host_lock, flags);
>>> +
>>>  	if (rtn) {
>>>  		trace_scsi_dispatch_cmd_error(cmd, rtn);
>>>  		if (rtn != SCSI_MLQUEUE_DEVICE_BUSY &&
>>
>> Are you planning a future revision that moves the acquiring of the host lock
>> into the LLDD's queuecommand for all the other drivers you don't currently
>> touch in this patch set?
>>
> 
> Hi Brian,
> 
> I was under the impression that this would be unnecessary for the vast
> majority of existing LLD drivers, but if you are aware of specific LLDs
> that would still need the struct Scsi_Host->host_lock held in their
> SHT->queuecommand() for whatever reason please let me know and I would
> be happy to include this into an RFCv4.

I would think that most drivers might have issues without some pretty careful
auditing. When Christoph did this for the EH handlers, the first step was to
simply move acquiring the host lock into the LLDs. That way we can optimize
drivers one at a time after ensuring they can run lockless in their queuecommand
handler.
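
As a rough illustration of that first step, here is a minimal userspace sketch, not kernel code. A pthread mutex stands in for the host lock, and every name in it (model_host, model_cmd, model_queuecommand and so on) is hypothetical. The driver's existing handler is left unchanged and still assumes the lock is held; a thin wrapper takes the lock that the midlayer no longer takes.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Userspace stand-ins only; these are not the kernel's real types. */
struct model_host {
    pthread_mutex_t host_lock;   /* models Scsi_Host->host_lock */
    int queued;                  /* number of commands accepted so far */
};

struct model_cmd {
    struct model_host *host;
    int tag;                     /* slot assigned when the command is queued */
};

/* The driver's original handler, written on the assumption that the
 * caller already holds host_lock. */
static int model_queuecommand_locked(struct model_cmd *cmd)
{
    /* Debug check: trylock on a held, non-recursive mutex returns EBUSY. */
    assert(pthread_mutex_trylock(&cmd->host->host_lock) == EBUSY);

    cmd->tag = cmd->host->queued++;
    return 0;
}

/* Step one of the conversion: the driver's entry point takes the lock
 * itself, so the locked handler sees the same environment as before. */
static int model_queuecommand(struct model_cmd *cmd)
{
    int rtn;

    pthread_mutex_lock(&cmd->host->host_lock);
    rtn = model_queuecommand_locked(cmd);
    pthread_mutex_unlock(&cmd->host->host_lock);
    return rtn;
}
```

With this shape, a driver can later be audited on its own and the wrapper dropped only once its handler is known to be safe without the lock.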

A couple of examples of possible issues with drivers I'm familiar with (ibmvfc, ipr):

* Some drivers will do list manipulation for resources needed to send commands. If
  done lockless, this could result in list corruption with multiple readers/writers.

* Some drivers check the state of the hardware before sending a command. Failing to
  do this when the hardware is being reset may result in nasty things like PCI bus
  errors or even sending a command to the wrong device.
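
Both patterns can be sketched together in a small userspace model. This is an illustration under assumptions, not code from ibmvfc or ipr: the names (model_adapter, model_blk, model_send, MODEL_HOST_BUSY) are hypothetical, and a pthread mutex again stands in for the host lock. The state check and the free-list update happen in one critical section, which is what would be lost if the handler ran with no lock at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_OK        0
#define MODEL_HOST_BUSY 1   /* hypothetical "retry later" return code */

/* A per-command resource kept on a singly linked free list. */
struct model_blk {
    struct model_blk *next;
};

struct model_adapter {
    pthread_mutex_t lock;         /* models the host lock */
    int resetting;                /* nonzero while the hardware is in reset */
    struct model_blk *free_list;  /* resources available for new commands */
};

/* Check the hardware state and take a command block under one lock, so a
 * reset cannot begin between the check and the send, and two senders
 * cannot pop the same block. */
static int model_send(struct model_adapter *ad, struct model_blk **out)
{
    int rtn = MODEL_HOST_BUSY;

    assert(ad != NULL && out != NULL);

    pthread_mutex_lock(&ad->lock);
    if (!ad->resetting && ad->free_list != NULL) {
        *out = ad->free_list;
        ad->free_list = (*out)->next;
        (*out)->next = NULL;
        rtn = MODEL_OK;
    }
    pthread_mutex_unlock(&ad->lock);
    return rtn;
}
```

A reset path in this model would set resetting under the same lock; without that pairing the check in model_send() proves nothing.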

These are the sorts of errors that will be very difficult to hit but have
pretty bad consequences when they are hit.

Thanks,

Brian

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center

