Date:	Thu, 11 Aug 2011 09:33:47 -0500
From:	James Bottomley <James.Bottomley@...senPartnership.com>
To:	Jun'ichi Nomura <j-nomura@...jp.nec.com>
Cc:	jaxboe@...ionio.com, roland@...estorage.com,
	stern@...land.harvard.edu, linux-scsi@...r.kernel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	device-mapper development <dm-devel@...hat.com>,
	Kiyoshi Ueda <k-ueda@...jp.nec.com>
Subject: Re: [BUG] Oops when SCSI device under multipath is removed

On Thu, 2011-08-11 at 12:01 +0900, Jun'ichi Nomura wrote:
> Hi James,
> 
> On 08/11/11 09:24, Jun'ichi Nomura wrote:
> > On 08/11/11 04:52, James Bottomley wrote:
> >> On Wed, 2011-08-10 at 13:29 +0900, Jun'ichi Nomura wrote:
> >>>   2) SCSI to call blk_cleanup_queue() from device's ->release() callback
> >>>      (before 2.6.39, it used to work like this)
> >>>      https://lkml.org/lkml/2011/7/2/106
> >>
> >> Well, they both have documented objections.  I asked why we destroy the
> >> elevator in the del case and didn't get any traction, so let me show the
> >> actual patch which should fix all of these issues.
> >>
> >> Is there a good reason for not doing this as a bug fix now?
> ...
> > I think it doesn't work because elevator_exit() and
> > blk_throtl_exit() take &q->queue_lock, which may be freed
> > by LLD after blk_cleanup_queue, before blk_release_queue.
> 
> If the reason you moved scsi_free_queue into scsi_remove_device
> is marking the queue dead, how about the following patch?
> Do you think it's acceptable?

Well, it's just hiding the problem.  The essential problem is that only
block has the correctly refcounted knowledge to know the last release of
the queue reference.  Until that time, the holder of the reference can
use the queue regardless of whether blk_cleanup_queue() has been called.
This is the race you complain about since use of the queue involves the
lock which should be guarded by QUEUE_DEAD checks.

This is essentially unfixable with function calls.  The only way to fix
it is to have a callback model for freeing the external lock.
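
To make the callback idea concrete, here is a minimal user-space sketch. It is
not the block layer API: struct queue, struct ext_lock, queue_alloc(),
queue_get(), queue_cleanup(), queue_put() and the release_lock hook are all
hypothetical names invented for illustration, and the refcount is a plain int
with no atomics, so it only models the ordering in a single thread. The point
it shows is that cleanup merely marks the queue dead, while the externally
owned lock is handed back to its owner only from the last put.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lock owned by the low-level driver. */
struct ext_lock {
	int locked;
};

/* Hypothetical stand-in for a refcounted queue; not struct request_queue. */
struct queue {
	int refcount;                                /* non-atomic: sketch only */
	int dead;                                    /* set by queue_cleanup() */
	struct ext_lock *lock;                       /* owned by the driver */
	void (*release_lock)(struct ext_lock *lock); /* run on the last put */
};

/* Counts how many times the driver callback has freed a lock. */
static int locks_freed;

/* Driver-side callback: the only place the external lock is freed. */
static void driver_free_lock(struct ext_lock *lock)
{
	free(lock);
	locks_freed++;
}

/* Allocate a queue holding one reference and register the callback. */
static struct queue *queue_alloc(void (*release_lock)(struct ext_lock *))
{
	struct queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->lock = calloc(1, sizeof(*q->lock));
	if (!q->lock) {
		free(q);
		return NULL;
	}
	q->refcount = 1;
	q->release_lock = release_lock;
	return q;
}

/* Take an extra reference, as any other holder of the queue would. */
static void queue_get(struct queue *q)
{
	q->refcount++;
}

/* Mark the queue dead. The lock is deliberately left alone here. */
static void queue_cleanup(struct queue *q)
{
	q->dead = 1;
}

/* Drop a reference; returns 1 if this was the last one and q was freed. */
static int queue_put(struct queue *q)
{
	if (--q->refcount > 0)
		return 0;
	if (q->release_lock)
		q->release_lock(q->lock);
	free(q);
	return 1;
}
```

In this model a holder that still has a reference after queue_cleanup() can
keep taking q->lock safely, because the driver only learns that the lock may
be freed when the refcounted owner drops the final reference.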

