Date:	Wed, 27 Oct 2010 12:55:18 -0700
From:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
To:	Mike Anderson <andmike@...ux.vnet.ibm.com>
Cc:	James Bottomley <James.Bottomley@...e.de>,
	Andi Kleen <ak@...ux.intel.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	Vasu Dev <vasu.dev@...ux.intel.com>,
	Tim Chen <tim.c.chen@...ux.intel.com>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Mike Christie <michaelc@...wisc.edu>,
	Jens Axboe <jaxboe@...ionio.com>,
	James Smart <james.smart@...lex.com>,
	Andrew Vasquez <andrew.vasquez@...gic.com>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	Hannes Reinecke <hare@...e.de>,
	Joe Eykholt <jeykholt@...co.com>,
	Christoph Hellwig <hch@....de>,
	Jon Hawley <warthog9@...nel.org>,
	Brian King <brking@...ux.vnet.ibm.com>,
	Christof Schmitt <christof.schmitt@...ibm.com>,
	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [ANNOUNCE] Status of unlocked_qcmds=1 operation for .37

On Wed, 2010-10-27 at 12:20 -0700, Mike Anderson wrote:
> Nicholas A. Bellinger <nab@...ux-iscsi.org> wrote:
> > On Wed, 2010-10-27 at 09:27 -0500, James Bottomley wrote:
> > > On Wed, 2010-10-27 at 09:53 +0200, Andi Kleen wrote:
> > > > > This sounds like a pretty reasonable compromise that I think is slightly
> > > > > less risky for the LLDs with the ghosts and cob-webs hanging off of
> > > > > them.
> > > > 
> > > > They won't get tested either next release cycle. Essentially
> > > > near nobody uses them.
> > > > 
> > > > > 
> > > > > What do you think..?
> > > > 
> > > > Standard linux practice is to simply push the locks down. That's a pretty
> > > > mechanical operation and shouldn't be too risky
> > > > 
> > > > With some luck you could even do it with coccinelle.
> > > 
> > > Precisely ... if we can do the push down now as a mechanical
> > > transformation we can put it in the current merge window as a low risk
> > > API change.
> > 
> > I disagree that touching every single legacy LLD's SHT->queuecommand()
> > and failure paths in that code is a low risk change.
> > 
> > >   This gives us optimal exposure to the rc sequence to sort
> > > out any problems that arise (or drivers that got missed) with the lowest
> > > risk of such problems actually arising.
> > 
> > Yes, 
> > 
> > > Given the corner cases and the
> > > late arrival of fixes, the serial number changes are just too risky for
> > > the current merge window.
> > 
> > I think with andmike's testing and ACKs for the necessary scsi_error.c
> > changes this would be an acceptable risk.
> > 
> 
> Adding SCSI_EH_SOFTIRQ_DONE in scsi_softirq_done is not going to provide
> value in scsi_try_to_abort_cmd. scsi_softirq_done calls scsi_eh_scmd_add
> without the SCSI_EH_CANCEL_CMD flag set which will stop
> scsi_try_to_abort_cmd from being called.
> 
> Removing the serial_number check in scsi_try_to_abort_cmd and not
> replacing it may be the correct action as we should be relying on the
> block complete checking. That said what James has indicated about
> splitting the serial number change out seems like the lower risk approach
> at this time.
> 

Hmm, that is unfortunate..

So in this case it would make sense to drop the explicit LLD usage of
scsi_cmd_get_serial(), re-include it in scsi_dispatch_cmd() for
all LLDs, and deal with a per scsi_host atomic_t serial_number
counter.  Anyway, I will go ahead and respin another series to follow
this logic shortly.
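
To make the per-host counter idea concrete, here is a minimal userspace
sketch using C11 atomics. It is not kernel code: struct demo_host,
struct demo_cmd, demo_get_serial() and demo_dispatch_cmd() are
hypothetical stand-ins, and treating 0 as "no serial assigned" is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins, NOT the kernel's Scsi_Host / scsi_cmnd. */
struct demo_host {
	atomic_ulong next_serial;	/* per-host counter, no lock needed */
};

struct demo_cmd {
	unsigned long serial_number;	/* assumption: 0 means "not assigned" */
};

static void demo_host_init(struct demo_host *h)
{
	atomic_init(&h->next_serial, 0);
}

/* Hand out the next non-zero serial number for this host. */
static unsigned long demo_get_serial(struct demo_host *h)
{
	unsigned long s = atomic_fetch_add(&h->next_serial, 1) + 1;

	if (s == 0)	/* counter wrapped: skip the reserved value 0 */
		s = atomic_fetch_add(&h->next_serial, 1) + 1;
	return s;
}

/* Assign the serial in the common dispatch path instead of in each LLD. */
static void demo_dispatch_cmd(struct demo_host *h, struct demo_cmd *c)
{
	c->serial_number = demo_get_serial(h);
	assert(c->serial_number != 0);
	/* a real dispatch path would hand the command to the LLD here */
}
```

The point of the sketch is only that the increment is a single atomic
read-modify-write, so callers do not need to hold a lock to get a
unique value per host.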

The other question, mentioned in my email yesterday, is whether
the clearing of a non atomic_t cmd->serial_number from
scsi_softirq_done() -> scsi_try_to_abort_cmd() is safe to begin with..?
Does this need to be converted to an atomic_t as well to prevent a
subtle race outside of any of the host_lock-less series of patches..?
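
For illustration only, this is roughly what an atomic clear could look
like in userspace C11. struct demo_eh_cmd and demo_claim_cmd() are
hypothetical names, and nothing here claims to be how scsi_error.c
handles it; it just shows a clear that cannot race with a concurrent
reader of the same field.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a command with an atomic serial number. */
struct demo_eh_cmd {
	atomic_ulong serial_number;	/* assumption: 0 means "cleared" */
};

/*
 * Clear the serial number and report whether it was still set.
 * atomic_exchange() makes the read and the clear one indivisible step,
 * so at most one caller sees a non-zero old value.
 */
static bool demo_claim_cmd(struct demo_eh_cmd *c)
{
	return atomic_exchange(&c->serial_number, 0) != 0;
}
```

With a plain unsigned long, the check and the clear are two separate
accesses, which is where a window could open if both paths run at once.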

--nab


