linux-kernel - Re: [ANNOUNCE] Status of unlocked

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1288134048.5169.132.camel@haakon2.linux-iscsi.org>
Date:	Tue, 26 Oct 2010 16:00:48 -0700
From:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
To:	James Bottomley <James.Bottomley@...e.de>
Cc:	Mike Anderson <andmike@...ux.vnet.ibm.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	Vasu Dev <vasu.dev@...ux.intel.com>,
	Tim Chen <tim.c.chen@...ux.intel.com>,
	Andi Kleen <ak@...ux.intel.com>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Mike Christie <michaelc@...wisc.edu>,
	Jens Axboe <jaxboe@...ionio.com>,
	James Smart <james.smart@...lex.com>,
	Andrew Vasquez <andrew.vasquez@...gic.com>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	Hannes Reinecke <hare@...e.de>,
	Joe Eykholt <jeykholt@...co.com>,
	Christoph Hellwig <hch@....de>,
	Jon Hawley <warthog9@...nel.org>,
	Brian King <brking@...ux.vnet.ibm.com>,
	Christof Schmitt <christof.schmitt@...ibm.com>,
	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [ANNOUNCE] Status of unlocked_qcmds=1 operation for .37

On Tue, 2010-10-26 at 22:50 +0000, James Bottomley wrote:
> On Tue, 2010-10-26 at 15:34 -0700, Nicholas A. Bellinger wrote:
> > On Tue, 2010-10-26 at 17:27 -0500, James Bottomley wrote:
> > > On Tue, 2010-10-26 at 15:08 -0700, Nicholas A. Bellinger wrote:
> > > > [PATCH] scsi: Add SCSI_EH_SOFTIRQ_DONE usage
> > > > 
> > > > This patch introduces a SCSI_EH_SOFTIRQ_DONE flag that is set in scsi_softirq_done()
> > > > from block soft_irq context that is used to signal when scsi_try_to_abort_cmd() should
> > > > be calling __scsi_try_to_abort_cmd() for a timed out struct scsi_cmnd instead of
> > > > returning SUCCESS via checking only blk_test_rq_complete().  This is done because
> > > > blk_rq_timed_out_timer() calls blk_mark_rq_complete() before blk_rq_timed_out() ->
> > > > struct request_queue->rq_timed_out_fn().
> > > 
> > > This is getting pretty far off into the weeds.  I think the first step
> > > should be queue lock push down into ->queuecommand.  This would still
> > > necessitate locking around the serial number.
> > 
> > Hmmm, I am not sure I understand what you mean here.  My understanding
> > is that the whole point of the series was to remove any locking around
> > the serial number in order to make scsi_dispatch_cmd() lockless,
> > right..?
> 
> The point is that several things have to happen for that to be a
> reality.  The easiest and most obvious thing is lock push down in
> ->queuecommand.
> 

Ok, can you please explain to me what you mean here..?  As I really
thought we came to consensus that:

*) running in unlocked_qcmd=1 was to be made the default for LLDs using
the legacy optimization of ->queuecommand() -> unlock() ->
do_some_lld_work() -> lock() -> return to scsi_dispatch_cmd()

*) For the mpt-fusion / mpt2sas drivers which did not use the legacy
optimization, but Tim Chen has tested with the former and I am awaiting
an ACK from LSI on the latter.

*) All other drivers will function with host_lock held (eg: legacy mode)
in scsi_dispatch_cmd().

*) All drivers using cmd->serial_number for anything beyond an
informational purpose converted to use scsi_cmd_get_serial().

> The next is most likely serial number elimination.
> 
> But the point is that we don't have to do the whole thing all at once
> (and spend months trying to get the series right).

Not exactly correct, we have the whole thing ready right now with libfc
running unlocked_qcmd=0 legacy mode.

Once I verify START_STOP case in scsi_error.c, I am happy to respin a
mergeable tree from linus HEAD for you to pull ASAP.

> 
> > That is what the current patch series already does, in that it makes the
> > use of cmd->serial_number optional and requires LLDs who use this for
> > anything beyond an informational purpose to explictly call
> > scsi_cmd_get_serial(). 
> 
> OK, so this patch is a corner case where the error handler is using the
> serial number value to deduce something the block layer already knows
> (whether the command completed or not). I don't think introducing a
> substitute flag is the right way, the information should just be
> extracted properly.
> 

Well, according to andmike this is two corner cases that are a result of
the drop-host_lock-v4 series using only blk_test_rq_complete() in
scsi_try_to_abort_cmd(), which will be true because of:

blk_rq_timed_out_timer() ->  blk_mark_rq_complete()

> But arguments about this don't have to impede the lock push down.
> 

Sure, but I assume you mean the lock push down only for the legacy LLDs
that are not already internally unlocking host_lock in
SHT->queuecommand() in mainline code right..?  

--nab


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/