lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <540E1205.9060201@interlog.com>
Date:	Mon, 08 Sep 2014 16:31:01 -0400
From:	Douglas Gilbert <dgilbert@...erlog.com>
To:	Christoph Hellwig <hch@...radead.org>,
	Bart Van Assche <bvanassche@....org>
CC:	SCSI development list <linux-scsi@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	James Bottomley <james.bottomley@...senpartnership.com>,
	Milan Broz <gmazyland@...il.com>
Subject: Re: [PATCH] scsi_debug: deadlock between completions and surprise
 module removal

On 14-09-08 11:07 AM, Christoph Hellwig wrote:
> On Mon, Sep 08, 2014 at 11:11:23AM +0200, Bart Van Assche wrote:
>> Hello Doug,
>>
>> In the scsi_debug driver scsi_remove_host() is called from inside the
>> sdebug_driver_remove() callback function. Unless I have missed something it
>> is not guaranteed that that callback function is invoked before unloading of
>> the scsi_debug driver has finished. I think most of the code in
>> sdebug_driver_remove() should be moved to sdebug_remove_adapter().
>
> I'm not sure that's right.  scsi_debug uses the driver mode in a
> slightly unusual way, and includes both the bus driver, device and
> device driver.
>
> sdebug_driver_remove is a bus method, but as we don't have driver methods
> should act very much like all other _remove callbacks.
>
> sdebug_remove_adapter is more a "bus-level" function that calls into
> the driver model to unbind devices from the driver.
>
> But we defintively shouldn't stop and free queued command before we
> fully remove the hosts.  As far as I can tell the stop_all_queued
> call can be entirely removed from the remove path, as the midlayer
> will take care of waiting for all commands to return, and the
> free_all_queued should be after all hosts are unregistered.

stop_all_queued() is doing hrtimer_cancel(), del_timer_sync()
or tasklet_kill() on all the scsi_cmnd objects that are
"in play". Unless another mechanism calls the .eh_abort_handler
entry point reliably on each "in play" command then the module
cannot be removed. That is because some timer expiry callbacks
are pending.

> Something like the (untested) patch below would do the trick.
> We'd still need Dougs patch for the EH case, though.
>
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index d19c0e3..d022c2f 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -3983,14 +3983,13 @@ static void __exit scsi_debug_exit(void)
>   {
>   	int k = scsi_debug_add_host;
>
> -	stop_all_queued();
> -	free_all_queued();
>   	for (; k; k--)
>   		sdebug_remove_adapter();
>   	driver_unregister(&sdebug_driverfs_driver);
>   	bus_unregister(&pseudo_lld_bus);
>   	root_device_unregister(pseudo_primary);
>
> +	free_all_queued();
>   	if (dif_storep)
>   		vfree(dif_storep);
>
> --

The only other call to stop_all_queued() is from the
.eh_host_reset_handler entry point.

Doug Gilbert


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ