Message-ID: <52099E1B.5000203@oracle.com>
Date:	Tue, 13 Aug 2013 10:46:51 +0800
From:	vaughan <vaughan.cao@...cle.com>
To:	dgilbert@...erlog.com
CC:	Jörn Engel <joern@...fs.org>,
	JBottomley@...allels.com, linux-scsi@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

On 08/06/2013 04:52 AM, Douglas Gilbert wrote:
> On 13-08-04 10:19 PM, vaughan wrote:
>> On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
>>> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>>>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>>>
>>>>>> There is a race when opening sg with the O_EXCL flag. A race may also
>>>>>> happen between sg_open and sg_remove.
>>>>>>
>>>>>> Changes from v4:
>>>>>>    * [3/4] use ERR_PTR series instead of adding another parameter in
>>>>>>      sg_add_sfp
>>>>>>    * [4/4] fix conflict for cherry-pick from v3.
>>>>>>
>>>>>> Changes from v3:
>>>>>>    * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>>>    * not set exclude with sfd_lock held.
>>>>>>
>>>>>> Vaughan Cao (4):
>>>>>>     [SCSI] sg: use rwsem to solve race during exclusive open
>>>>>>     [SCSI] sg: no need sg_open_exclusive_lock
>>>>>>     [SCSI] sg: checking sdp->detached isn't protected when open
>>>>>>     [SCSI] sg: push file descriptor list locking down to per-device
>>>>>>       locking
>>>>>>
>>>>>>    drivers/scsi/sg.c | 178 +++++++++++++++++++++++++-----------------------------
>>>>>>    1 file changed, 83 insertions(+), 95 deletions(-)
>>>>>
>>>>> Patchset looks good to me, although I didn't test it on hardware yet.
>>>>> Signed-off-by: Joern Engel <joern@...fs.org>
>>>>>
>>>>> James, care to pick this up?
>>>>
>>>> Acked-by: Douglas Gilbert <dgilbert@...erlog.com>
>>>>
>>>> Tested O_EXCL with multiple processes and threads; passed.
>>>> sg driver prior to this patch had "leaky" O_EXCL logic
>>>> according to the same test. Block device passed.
>>>>
>>>> James, could you clean this up:
>>>>     drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable]
>>>
>>> Further testing suggests this patch on the sg driver is
>>> broken, so I'll rescind my ack.
>>>
>>> The case it is broken for is when a device is opened
>>> without O_EXCL. Now if, while it is open, a second
>>> thread/process tries to open the same device O_EXCL
>>> then IMO the second open should fail with EBUSY.
>>>
>>> My testing shows that O_EXCL opens properly deflect
>>> other O_EXCL opens.
>> Hi  Doug,
>>
>> My test doesn't show this issue. The routine is roughly as follows:
>>
>> I start three opens without O_EXCL, each held for 30 s, and then open
>> with O_EXCL|O_NONBLOCK; that open fails with EBUSY.
>> I also ran myopen with and without O_EXCL many times in the background
>> at the same time, and the test passed. I don't know why it failed in
>> your test.
>>
>> Usage: myopen [-e][-n][-d delay] -f file
>>        -e: exclude
>>        -n: nonblock
>>        -d: delay N seconds and then close.
>>
>> [root@...aowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>> [1] 3417
>> [root@...aowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>> [2] 3418
>> [root@...aowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
>> [3] 3419
>> [root@...aowol5 16835013]# cat /proc/scsi/sg/debug
>> max_active_device=6(origin 1)
>>   def_reserved_size=32768
>>   >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
>>     FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>     cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>       No requests active
>>     FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>     cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>       No requests active
>>     FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>     cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>       No requests active
>>
>> [root@...aowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
>> [4] 3422
>> [3422:3351] /dev/sg5:exclude: Device or resource busy
>>
>> [4]+  Exit 1                  ./myopen -e -n -f /dev/sg5 -d 30
>>
>> [root@...aowol5 16835013]# cat /proc/scsi/sg/debug
>> max_active_device=6(origin 1)
>>   def_reserved_size=32768
>>   >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
>>     FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>     cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>       No requests active
>>     FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>     cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>       No requests active
>>     FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>     cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>       No requests active
>> [root@...aowol5 16835013]# cat /proc/scsi/sg/debug
>> [1]   Done                    ./myopen -f /dev/sg5 -d 30
>> [2]-  Done                    ./myopen -f /dev/sg5 -d 30
>> [3]+  Done                    ./myopen -f /dev/sg5 -d 30
>>
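For readers following along, the open/hold/close routine described above can be sketched roughly as follows. This is not the actual myopen source, which was not posted; the function names build_open_flags and open_hold_close, the use of O_RDWR, and the error message format are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: map the -e and -n options onto open(2) flags.
 * O_RDWR is an assumption; the real tool may open read-only. */
int build_open_flags(int excl, int nonblock)
{
    int flags = O_RDWR;

    if (excl)
        flags |= O_EXCL;
    if (nonblock)
        flags |= O_NONBLOCK;
    return flags;
}

/* Hypothetical helper: open the node, hold it for delay_s seconds (the -d
 * option), then close it. Returns 0 on success or -errno if open() fails. */
int open_hold_close(const char *path, int excl, int nonblock,
                    unsigned int delay_s)
{
    assert(path != NULL);

    int fd = open(path, build_open_flags(excl, nonblock));
    if (fd < 0) {
        int err = errno;

        fprintf(stderr, "[%ld] %s:%s: %s\n", (long)getpid(), path,
                excl ? "exclude" : "shared", strerror(err));
        return -err;
    }
    sleep(delay_s);
    close(fd);
    return 0;
}
```

With three shared holders still open, a call such as open_hold_close("/dev/sg5", 1, 1, 30) would be expected to return -EBUSY on a driver that behaves as in the transcript above.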
>
> Hi,
> After the initial failures about 36 hours ago, retesting
> yesterday and today has not produced any unexpected
> failures. And I have been trying hard on lk 3.10.4 and
> lk 3.10.5 .
>
> My test program is a bit more intense than yours and can
> be found in the sg3_utils beta in the News section of this
> page:
>   http://sg.danny.cz/sg/
>
> It is in the examples directory, two variants called
> sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
> compiler, IOW something that can compile c++11 . gcc 4.7.3
> in Ubuntu 13.04 only just manages, fedora 19 should do
> better with gcc 4.8.1 . The threading is implemented using
> pthreads so it should be reliable.
>
> Typically I run multiple instances (processes) and each has
> multiple threads. One instance can run '-x' which will cause
> its first thread not to use O_EXCL **. All my tests currently
> use O_NONBLOCK and that leads to lots of EBUSYs (sometimes
> in the billions).
>
> Doug Gilbert
>
>
> ** Using '-x' on two instances will cause an expected failure
>    so can be used as a control.
>
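A much-reduced sketch of the kind of stress test described above is shown here: several workers hammer the same node with O_EXCL|O_NONBLOCK, count the EBUSY deflections, and flag any moment at which two of them hold the device at once. It is not sg_tst_excl; it uses forked processes with a shared counter instead of C++11 threads, and every name in it (excl_stats, excl_enter, excl_leave, excl_worker, excl_run, EXCL_MAX_PROCS) is hypothetical. It only makes sense against a node whose driver enforces O_EXCL, such as an sg device; regular files ignore O_EXCL without O_CREAT.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define EXCL_MAX_PROCS 64

/* Counters shared by all workers through a MAP_SHARED mapping. */
struct excl_stats {
    atomic_int  holders;     /* workers currently inside an O_EXCL open */
    atomic_long violations;  /* entries that found another holder present */
    atomic_long ebusy;       /* opens deflected with EBUSY */
    atomic_long opened;      /* opens that succeeded */
};

struct excl_result { long opened, ebusy, violations; };

/* Note that we now hold the device; returns 1 if someone else already did. */
int excl_enter(struct excl_stats *st)
{
    if (atomic_fetch_add(&st->holders, 1) != 0) {
        atomic_fetch_add(&st->violations, 1);
        return 1;
    }
    return 0;
}

void excl_leave(struct excl_stats *st)
{
    atomic_fetch_sub(&st->holders, 1);
}

/* One worker: repeatedly try an exclusive, non-blocking open. */
void excl_worker(const char *path, int iterations, struct excl_stats *st)
{
    for (int i = 0; i < iterations; i++) {
        int fd = open(path, O_RDWR | O_EXCL | O_NONBLOCK);

        if (fd < 0) {
            if (errno == EBUSY)
                atomic_fetch_add(&st->ebusy, 1);
            continue;
        }
        atomic_fetch_add(&st->opened, 1);
        excl_enter(st);
        usleep(100);          /* hold briefly so overlaps become visible */
        excl_leave(st);
        close(fd);
    }
}

/* Fork nproc workers and collect the totals. Returns 0, or -1 on failure. */
int excl_run(const char *path, int nproc, int iterations,
             struct excl_result *out)
{
    pid_t pids[EXCL_MAX_PROCS];

    assert(nproc > 0 && nproc <= EXCL_MAX_PROCS);
    /* Anonymous shared mappings start zero-filled, so counters begin at 0. */
    struct excl_stats *st = mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;

    for (int i = 0; i < nproc; i++) {
        pids[i] = fork();
        if (pids[i] == 0) {
            excl_worker(path, iterations, st);
            _exit(0);
        }
    }
    for (int i = 0; i < nproc; i++)
        if (pids[i] > 0)
            waitpid(pids[i], NULL, 0);

    out->opened = atomic_load(&st->opened);
    out->ebusy = atomic_load(&st->ebusy);
    out->violations = atomic_load(&st->violations);
    munmap(st, sizeof(*st));
    return 0;
}
```

A nonzero violations count after something like excl_run("/dev/sg5", 4, 100000, &r) would point at leaky O_EXCL handling. The sketch only detects overlap between its own workers, so it says nothing about the mixed O_EXCL and non-O_EXCL case that the '-x' option exercises.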
Hi Doug,

Can I take this as you ACKing the series again?

Vaughan
