lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5404CE6A.8090402@interlog.com>
Date:	Mon, 01 Sep 2014 15:52:10 -0400
From:	Douglas Gilbert <dgilbert@...erlog.com>
To:	Christoph Hellwig <hch@...radead.org>
CC:	SCSI development list <linux-scsi@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	James Bottomley <james.bottomley@...senpartnership.com>,
	Milan Broz <gmazyland@...il.com>
Subject: Re: [PATCH] scsi_debug: deadlock between completions and surprise
 module removal

On 14-09-01 11:36 AM, Christoph Hellwig wrote:
>> --- a/drivers/scsi/scsi_debug.c	2014-08-26 13:24:51.646948507 -0400
>> +++ b/drivers/scsi/scsi_debug.c	2014-08-30 18:04:54.589226679 -0400
>> @@ -2743,6 +2743,13 @@ static int stop_queued_cmnd(struct scsi_
>>   		if (test_bit(k, queued_in_use_bm)) {
>>   			sqcp = &queued_arr[k];
>>   			if (cmnd == sqcp->a_cmnd) {
>> +				devip = (struct sdebug_dev_info *)
>> +					cmnd->device->hostdata;
>> +				if (devip)
>> +					atomic_dec(&devip->num_in_q);
>> +				sqcp->a_cmnd = NULL;
>
> Why would the hostdata every be NULL here?  We should never
> call ->slave_destroy on a device that has outstanding commands.

To your first question, perhaps it could not be. In
the scsi_debug driver almost all uses of 'devip'
check for NULL, so it may not always have been true.

'rmmod scsi_debug' would lead to scsi_debug_exit()
being called and that called stop_all_queued() which
deadlocked with a command completion, or worse command
commencement. scsi_debug_exit() looks a bit racy: the
first thing it does is stop_all_queued() but has
anything been done to stop new commands being issued?
Later scsi_debug_exit() brings down the hosts.


This patch is primarily a bug fix for the lk 3.17 series and
the code you highlight was simply moved from being under a
lock to outside that lock. I didn't want to be too creative,
it's too easy to slip up and upset the management.


Enhancements could be sent to your drivers-for-3.18 tree. Rob
Elliott already has a few in mind, to improve performance.
Removing all 'devip' NULL checks would also improve performance,
and readability.

Doug Gilbert


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ