lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7668b2ad75e044d38b17f6ce32845dfb@darkmatter.ae>
Date:   Sun, 3 Feb 2019 05:54:51 +0000
From:   Keun-O Park <Keun-O.Park@...kmatter.ae>
To:     "maninder1.s@...sung.com" <maninder1.s@...sung.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "peter@...leysoftware.com" <peter@...leysoftware.com>,
        "jslaby@...e.com" <jslaby@...e.com>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "AMIT SAHRAWAT" <a.sahrawat@...sung.com>,
        Vaneet Narang <v.narang@...sung.com>,
        Rohit Thapliyal <r.thapliyal@...sung.com>,
        Ayush Mittal <ayush.m@...sung.com>
Subject: Re: race between flush_to_ldisc and pty_cleanup

Hi,

02/01/2019 07:31 PM에 Maninder Singh 이(가) 쓴 글:
> Hi,
>
>> On Fri, Feb 01, 2019 at 07:03:26PM +0530, Maninder Singh wrote:
>>> Hi,
>>>
>>>
>>> There is some race condition between tty_port_put and flush_to_ldisc
>>> which lead to use after free case:
>>> (Kernel 4.1)
>>>
>>> [1403.5130] Unable to handle kernel paging request at virtual address 6b6b6b83
>>> ...
>>> ...
>>> ...
>>>
>>> [1403.5132] [<c036a65c>] (ldsem_down_read_trylock) from [<c0367db4>] (tty_ldisc_ref+0x24/0x60)
>>> [1403.5132] [<c0367d90>] (tty_ldisc_ref) from [<c03690cc>] (flush_to_ldisc+0x6c/0x21c)
>>> [1403.5132]  r5:dbcd4a84 r4:00000000
>>> [1403.5132] [<c0369060>] (flush_to_ldisc) from [<c0049f18>] (process_one_work+0x214/0x570)
>>> [1403.5132]  r10:00000000 r9:ddab0000 r8:e3d6e000 r7:00000000 r6:e453f740 r5:cb37b780
>>> [1403.5132]  r4:dbcd4a84
>>> [1403.5132] [<c0049d04>] (process_one_work) from [<c004a2d4>] (worker_thread+0x60/0x580)
>>> [1403.5132]  r10:e453f740 r9:ddab0000 r8:e453f764 r7:00000088 r6:e453f740 r5:cb37b798
>>> [1403.5132]  r4:cb37b780
>>> [1403.5132] [<c004a274>] (worker_thread) from [<c005022c>] (kthread+0xec/0x104)
>>> [1403.5132]  r10:00000000 r9:00000000 r8:00000000 r7:c004a274 r6:cb37b780 r5:d8a3fc80
>>> [1403.5132]  r4:00000000
>>> [1403.5132] [<c0050140>] (kthread) from [<c00107d8>] (ret_from_fork+0x14/0x3c)
>>>
>>>
>>> for checking further we entered some debug prints and added delay in flush_to_ldisc to reproduce
>>> and seems there is some issue with workqueue implementation of TTY:
>>>
>>> bool tty_buffer_cancel_work(struct tty_port *port)
>>> {
>>>         bool ret;
>>>         ret = cancel_work_sync(&port->buf.work); // Check return value of cancel_work_sync
>>>         pr_emerg("Work cancelled is 0x%x %pS %d\n", (unsigned int)&port->buf.work, (void *)_RET_IP_, ret);
>>>         return ret;
>>> }
>>>
>>> static void flush_to_ldisc(struct work_struct *work)
>>> {
>>> ...
>>>         mdelay(100);   // Added Delay to reproduce race
>>>
>>>         if (flag_work_cancel) {
>>>                 pr_emerg("scheduled work after stopping work %x\n", (unsigned int)work);
>>>
>>> ....
>>> }
>>>
>>> static void pty_cleanup(struct tty_struct *tty)
>>> {
>>> ...
>>>         flag_work_cancel = 1;
>>>                 ...
>>> }
>>>
>>>
>>> [1403.4158]Work cancelled is dbcd4a84 tty_port_destroy+0x1c/0x6c  0       // Since return is 0 so no work is pending
>>>
>>> [1403.5129] scheduled work after stopping work dbcd4a84    // Still same work is scheduled after cancelled
>>> [1403.5130] Unable to handle kernel paging request at virtual address 6b6b6b83   // Kernel OOPs occured because of use after free
>> Ok, after my initial "use a newer kernel" comment, this really does look
>> strange.  There has also been a lot of workqueue fixes and rework since
>> 4.1, and that might be the thing that fixes this issue here.
>>
>> However, are you sure you are not just calling flush_to_ldisc() directly
>> through some codepath somehow?  If you look at the stack in the
> Yes, there is no call path for flush_to_disc directly. It is all aligned with kernel 4.1.
>
>
>> pr_emerg() message, where did it come from?  From the same workqueue
>> that you already stopped?
> We added debug prints to check "work" in pty_cleanup() & flush_to_ldisc
>
>> Testing on a newer kernel would be great, if possible.
> We are facing it hard, but currently we have 4.1 and able to reproduce on that.
> Not really possible to have the latest kernel on the same target and may be reproduce the same race.
> Tried to track for changes in the other stable branches, but no change looks really relevant for this race.
> I might be wrong, please help if there is any commit related with this.
Please find & review the attached patch for this issue.

I think the tty which called pty_cleanup may be different from the tty
which called flush_to_ldisc.
Please check if those two are same or different. I don't think this is a
workqueue problem.
Thanks.

Best Regards,
Sahara

>
>> thanks,
>>
>> greg k-h
> Our initial debugging direction was with "tty" but looks some issue in workqueue.
> Also, the most surprising looks to be the CANCEL and FLUSH to be occuring on the same CORE in sequence.
> So, if the CANCEL really worked how can the flush_to_ldisc be scheduled.
>
> Thanks,
> Maninder Singh
>



Keun-O Park | Principal Software Engineer
Tel: +971 2 417 1417 | Mob: +971 55 260 7343
Keun-O.Park@...kmatter.ae

The information transmitted, including attachments, is intended only for the person(s) or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and destroy any copies of this information.









View attachment "0001-tty-pty-Fix-race-condition-between-release_one_tty-a.patch" of type "text/x-patch" (2171 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ