linux-kernel - Re: [PATCH] tty: fix data races on tty

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CACT4Y+ZkSkd+-r2WY_gCbRXQpFN9Un-yvQ3cdTiO=89NfVFdbw@mail.gmail.com>
Date:	Mon, 7 Sep 2015 14:31:19 +0200
From:	Dmitry Vyukov <dvyukov@...gle.com>
To:	Peter Hurley <peter@...leysoftware.com>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, jslaby@...e.com,
	LKML <linux-kernel@...r.kernel.org>, Jiri Slaby <jslaby@...e.cz>,
	Andrey Konovalov <andreyknvl@...gle.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Hans Boehm <hboehm@...gle.com>
Subject: Re: [PATCH] tty: fix data races on tty_buffer.commit

Split this into 2 separate fixes (mailed separately).

Also dropped some READ/WRITE_ONCE. We would like to commit them as
well for the following reasons:
https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
What do you think about using READ/WRITE_ONCE for all concurrent
accesses to non-constant shared state in tty code?
After discovering some low-hanging data races we will proceed to
suppressing functions with "benign" data races, otherwise it becomes
unmanageable:
https://github.com/google/ktsan/blob/tsan/mm/ktsan/supp.c

Thank you







On Fri, Sep 4, 2015 at 10:13 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Fri, Sep 4, 2015 at 9:49 PM, Peter Hurley <peter@...leysoftware.com> wrote:
>> On 09/04/2015 03:37 PM, Dmitry Vyukov wrote:
>>> On Fri, Sep 4, 2015 at 9:34 PM, Peter Hurley <peter@...leysoftware.com> wrote:
>>>> Hi Dmitry,
>>>>
>>>> On 09/04/2015 03:09 PM, Dmitry Vyukov wrote:
>>>>> Race on buffer data happens in the following scenario:
>>>>> __tty_buffer_request_room does a plain write of tail->commit,
>>>>> no barriers were executed before that.
>>>>> At this point flush_to_ldisc reads this new value of commit,
>>>>> and reads buffer data, no barriers in between.
>>>>> The committed buffer data is not necessary visible to flush_to_ldisc.
>>>>
>>>> Please submit one patch for each "fix", because it is not possible
>>>> to review what you believe you're fixing.
>>>>
>>>> See below for an example.
>>>>
>>>>> Similar bug happens when tty_schedule_flip commits data.
>>>>>
>>>>> Another race happens in tty_buffer_flush. It uses plain reads
>>>>> to read tty_buffer.next, as the result it can free a buffer
>>>>> which has pending writes in __tty_buffer_request_room thread.
>>>>> For example, tty_buffer_flush calls tty_buffer_free which
>>>>> reads b->size, the size may not be visible to this thread.
>>>>> As the result a large buffer can hang in the freelist.
>>>>>
>>>>> Update commit with smp_store_release and read commit with
>>>>> smp_load_acquire, as it is commit that signals data readiness.
>>>>> This is orthogonal to the existing synchronization on tty_buffer.next,
>>>>> which is required to not dismiss a buffer with unconsumed data.
>>>>>
>>>>> The data race was found with KernelThreadSanitizer (KTSAN).
>>>>>
>>>>> Signed-off-by: Dmitry Vyukov <dvyukov@...gle.com>
>>>>> ---
>>>>>  drivers/tty/tty_buffer.c | 38 ++++++++++++++++++++++++--------------
>>>>>  1 file changed, 24 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
>>>>> index 4cf263d..4fae5d1 100644
>>>>> --- a/drivers/tty/tty_buffer.c
>>>>> +++ b/drivers/tty/tty_buffer.c
>>>>> @@ -89,7 +89,7 @@ void tty_buffer_unlock_exclusive(struct tty_port *port)
>>>>>       struct tty_bufhead *buf = &port->buf;
>>>>>       int restart;
>>>>>
>>>>> -     restart = buf->head->commit != buf->head->read;
>>>>> +     restart = READ_ONCE(buf->head->commit) != buf->head->read;
>>>>>
>>>>>       atomic_dec(&buf->priority);
>>>>>       mutex_unlock(&buf->lock);
>>>>> @@ -242,11 +242,14 @@ void tty_buffer_flush(struct tty_struct *tty, struct tty_ldisc *ld)
>>>>>       atomic_inc(&buf->priority);
>>>>>
>>>>>       mutex_lock(&buf->lock);
>>>>> -     while ((next = buf->head->next) != NULL) {
>>>>> +     /* paired with smp_store_release in __tty_buffer_request_room();
>>>>> +      * ensures there are no outstanding writes to buf->head when we free it
>>>>> +      */
>>>>> +     while ((next = smp_load_acquire(&buf->head->next)) != NULL) {
>>>>>               tty_buffer_free(port, buf->head);
>>>>>               buf->head = next;
>>>>>       }
>>>>> -     buf->head->read = buf->head->commit;
>>>>> +     buf->head->read = READ_ONCE(buf->head->commit);
>>>>>
>>>>>       if (ld && ld->ops->flush_buffer)
>>>>>               ld->ops->flush_buffer(tty);
>>>>> @@ -290,13 +293,15 @@ static int __tty_buffer_request_room(struct tty_port *port, size_t size,
>>>>>               if (n != NULL) {
>>>>>                       n->flags = flags;
>>>>>                       buf->tail = n;
>>>>> -                     b->commit = b->used;
>>>>> -                     /* paired w/ barrier in flush_to_ldisc(); ensures the
>>>>> -                      * latest commit value can be read before the head is
>>>>> -                      * advanced to the next buffer
>>>>> +                     /* paired with smp_load_acquire in flush_to_ldisc();
>>>>> +                      * ensures flush_to_ldisc() sees buffer data.
>>>>>                        */
>>>>> -                     smp_wmb();
>>>>> -                     b->next = n;
>>>>> +                     smp_store_release(&b->commit, b->used);
>>>>> +                     /* paired with smp_load_acquire in flush_to_ldisc();
>>>>> +                      * ensures the latest commit value can be read before
>>>>> +                      * the head is advanced to the next buffer
>>>>> +                      */
>>>>> +                     smp_store_release(&b->next, n);
>>>>>               } else if (change)
>>>>>                       size = 0;
>>>>>               else
>>>>> @@ -394,7 +399,10 @@ void tty_schedule_flip(struct tty_port *port)
>>>>>  {
>>>>>       struct tty_bufhead *buf = &port->buf;
>>>>>
>>>>> -     buf->tail->commit = buf->tail->used;
>>>>> +     /* paired with smp_load_acquire in flush_to_ldisc(); ensures the
>>>>> +      * committed data is visible to flush_to_ldisc()
>>>>> +      */
>>>>> +     smp_store_release(&buf->tail->commit, buf->tail->used);
>>>>>       schedule_work(&buf->work);
>>>>
>>>> schedule_work() is an implied barrier for obvious reasons.
>>>
>>> OK, I will split.
>>> To answer this particular question: you need release/write barrier
>>> _before_ the synchronizing store, not _after_. Once the store to
>>> commit happened, another thread can start reading buffer data, this
>>> thread has not yet executed schedule_work at this point.
>>
>> No.
>>
>> If the work is already running, a new work will be scheduled, and the
>> new work will pick up the changed commit index.
>>
>> If the work is already running /and it happens to see the new commit index/,
>> it will process the buffer.
>
> Problem is with this case /\/\/\
> The old work picks up the new commit, but this thread did not execute
> release barrier in between storing the data and storing the commit.
> Neither work executed acquire barrier between load of commit and load
> of data.
>
>
>> The new work will start and discover there is
>> nothing to do.
>>
>> Regards,
>> Peter Hurley
>>
>> PS - You need to base your patches on current mainline. You'll see that I already
>> converted the smp_rmb()/smp_wmb() of 'next' to load_acquire/store_release. FWIW,
>> that's not a fix, but a minor optimization. Commit sha 069f38b4983efaea9
>
> Just to make sure, you mean master branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git ?



-- 
Dmitry Vyukov, Software Engineer, dvyukov@...gle.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/