linux-kernel - Re: [PATCH v2] sg: O_EXCL and other lock handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <527542E3.7090809@interlog.com>
Date:	Sat, 02 Nov 2013 14:22:27 -0400
From:	Douglas Gilbert <dgilbert@...erlog.com>
To:	vaughan <vaughan.cao@...cle.com>,
	Christoph Hellwig <hch@...radead.org>
CC:	SCSI development list <linux-scsi@...r.kernel.org>,
	James Bottomley <james.bottomley@...senpartnership.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Jörn Engel <joern@...fs.org>
Subject: Re: [PATCH v2] sg: O_EXCL and other lock handling

On 13-11-01 01:16 AM, vaughan wrote:
> On 11/01/2013 03:20 AM, Douglas Gilbert wrote:
>> On 13-10-31 11:56 AM, Christoph Hellwig wrote:
>>>> +    struct semaphore or_sem;  /* protect co-incident opens and
>>>> releases */
>>>
>>> Seems like this should be a mutex.
>>
>> Yes, it is being used as a mutex. However looking at
>> their semantics (mutex.h versus semaphore.h), a mutex
>> takes into account the task owner. If the user space
>> wants to pass around a sg file descriptor in a Unix
>> domain socket (see TLPI, Kerrisk) I don't see why the
>> sg driver should object (and pay the small performance
>> hit for each check).
>>
>> A nasty side effect of a mutex taking into account the
>> task owner is that it can't be used in interrupt
>> context. My patch does not try to do that yet (see next
>> section) but why bother. Give me a simple mutex and
>> I'll use it.
>>
>>>> sfds_list_empty(Sg_device *sdp)
>>>>    {
>>>>        unsigned long flags;
>>>>        int ret;
>>>>
>>>> +    spin_lock_irqsave(&sdp->sfd_lock, flags);
>>>> +    ret = list_empty(&sdp->sfds);
>>>> +    spin_unlock_irqrestore(&sdp->sfd_lock, flags);
>>>>        return ret;
>>>
>>> Protecting just a list_empty check with a local will give you racy
>>> results.  Seems like you should take the look over the check and the
>>> resulting action that modifies the list.  That'd also mean replacing the
>>> wait_event* calls with open coded prepare_wait / finish_wait loops.
>>
>> Not (usually) in this case. The sdp->sfds list can only
>> be expanded by another sg_open(same_dev) but this has
>> been excluded by taking down(&sdp->or_sem) prior to that
>> call. The sdp->sfds list is only normally decreased by
>> sg_release() which is also excluded by down(&sdp->or_sem).
>>
>> The abnormal case is device removal (detaching). Now an
>> open(same_dev, O_EXCL) may start waiting just after a
>> detach but miss the wake up on open_wait. That suggests
>> the wake_up(open_wait) in sg_remove() should also
>> take the sdp->or_sem semaphore.
>> Ah, and if sg_remove() can be called from an interrupt
>> context then that takes out using mutexes :-)
>>
>> The two level of locks in sg_remove() is already making me
>> uncomfortable, adding the sdp->or_sem semaphore to the
>> mix calls for more analysis.
> CCed to Jörn Engel.
>
> I feel the same regarding the many locks. After reviewing patches I
> wrote these days, I suppose the version 2
> (https://lkml.org/lkml/2013/6/17/319) may worth a re-consideration somehow.
> The purpose of or_sem is to mutex co-incident open and allow only one
> thread to be processing since it completes checking condition to really
> linking a new sfd into sfd_siblings list. My v2 re-implement a sem using
> sfd_lock and toopen. But as Jörn pointed out in the following mail, what
> I implemented is a rw_sem and that is weird and made him "having trouble
> wrapping my head around the semantics of this variable"...
> With wait_event involved, it's indeed no sense to use a rw_sem here.
> However, if we restrict toopen in [0, 1], not [0, INT_MAX] as I did in
> v2, it will act in the same way as or_sem does. Beside the same affect
> as or_sem, this implement avoids the problem using in interrupt context
> and more readable for me, rather than twisting my head among many
> spinlocks and semaphores.
> Although v2 seems attractive to me, things would be more complicated
> when detached comes in...
>
> The following is a thought I presented previously:
> Is it possible to split the sg_add_sfp() into two
> functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
> work in sg_init_sfp()
> without any lock and let sg_add_sfp2() only serve lock-check-add in one
> place. It is less flaky.

I do not follow the last point but that is not important.


For reasons that I listed in a private post I think
that my patch presented in this thread is closer to
our goals than your patch (2013/6/17/319). Timing is
important as well since we are approaching the lk 3.13
merge window. Regressions are what will set us back.


The remaining flaws are in the detach device area
which always seems to be the last one worked on,
probably because it is uncommon and the hardest :-)
My test code hasn't broken this area with my latest
patch (but that may be just luck or not good enough
test strategies).

Also the ULDs should get some basic guarantees from
the mid-level about the attach and remove/detach
device functions. I was thinking along the lines of
"called once and only once" per removed device; won't
overlap with an attach for the same device; may be
called from an interrupt context; should not wait,
etc.

>> Also note that sdp->detached is only protected by a
>> volatile type modifier and checks on it are sprinkled
>> around the code in a rather unscientific way.
>>
>> As you are no doubt aware, handling the "surprise" device
>> removal case safely is hard, very hard. And I have tried
>> to test this, loading up 4 processes with 100 tasks each,
>> some with asynchronous queueing requests but most hanging
>> on an open_wait then remove the device. So far I have not
>> seen a hang. Again, somebody with a big machine and
>> patience might like to try a scaled up device removal test
>> to see what breaks.
>>
>>>> +    down(&sdp->or_sem);
>>>> +    alone = sfds_list_empty(sdp);
>>>> +    if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) {
>>>> +        retval = -EPERM; /* Don't allow O_EXCL with read only
>>>> access */
>>>> +        goto error_out;
>>>> +    }
>>>
>>> Seems like the pure flags check should move to the beginning of the
>>> function before taking any locks.
>>
>> As Vaughan pointed out, just prior to that down() is a call
>> scsi_block_when_processing_errors() for blocking open()s.
>
> Douglas,
> What Christoph points out is a sanity check for the input parameter
> flags, not the scsi_block_when_processing_errors issue.
> I guess you misread that:)

Yes, you are correct I misread the above. That test can go
to the top of sg_open() without harm and make the critical
code sections easier to follow. IMO not a show stopper.

Doug Gilbert


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/