lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 6 Mar 2019 21:07:00 +0800
From:   zhong jiang <zhongjiang@...wei.com>
To:     Peter Xu <peterx@...hat.com>, Mike Rapoport <rppt@...ux.ibm.com>,
        "Andrea Arcangeli" <aarcange@...hat.com>
CC:     Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+cbb52e396df3e565ab02@...kaller.appspotmail.com>,
        Michal Hocko <mhocko@...nel.org>, <cgroups@...r.kernel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        David Rientjes <rientjes@...gle.com>,
        Hugh Dickins <hughd@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>
Subject: Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

On 2019/3/6 16:12, Peter Xu wrote:
> On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote:
>> On 2019/3/6 14:26, Mike Rapoport wrote:
>>> Hi,
>>>
>>> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote:
>>>> On 2019/3/6 10:05, Andrea Arcangeli wrote:
>>>>> Hello everyone,
>>>>>
>>>>> [ CC'ed Mike and Peter ]
>>>>>
>>>>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote:
>>>>>> On 2019/3/5 14:26, Dmitry Vyukov wrote:
>>>>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@...wei.com> wrote:
>>>>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote:
>>>>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@...wei.com> wrote:
>>>>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote:
>>>>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@...wei.com> wrote:
>>>>>>>>>>>> Hi, guys
>>>>>>>>>>>>
>>>>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log.
>>>>>>>>>>>>
>>>>>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF.
>>>>>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner.
>>>>>>>>>>>>
>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer.
>>>>>>>>>>> This looks like a very subtle race (threaded reproducer that runs
>>>>>>>>>>> repeatedly in multiple processes), so most likely we are looking for
>>>>>>>>>>> something like few instructions inconsistency window.
>>>>>>>>>>>
>>>>>>>>>> I has a little doubtful about the instrustions inconsistency window.
>>>>>>>>>>
>>>>>>>>>> I guess that you mean some smb barriers should be taken into account.:-)
>>>>>>>>>>
>>>>>>>>>> Because IMO, It should not be the lock case to result in the issue.
>>>>>>>>> Since the crash was triggered on x86 _most likley_ this is not a
>>>>>>>>> missed barrier. What I meant is that one thread needs to executed some
>>>>>>>>> code, while another thread is stopped within few instructions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> It is weird and I can not find any relationship you had said with the issue.:-(
>>>>>>>>
>>>>>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it.
>>>>>>>>
>>>>>>>> From the lastest freed task call trace, It fails to create process.
>>>>>>>>
>>>>>>>> Am I miss something or I misunderstand your meaning. Please correct me.
>>>>>>> Your analysis looks correct. I am just saying that the root cause of
>>>>>>> this use-after-free seems to be a race condition.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Yep, Indeed,  I can not figure out how the race works. I will dig up further.
>>>>> Yes it's a race condition.
>>>>>
>>>>> We were aware about the non-cooperative fork userfaultfd feature
>>>>> creating userfaultfd file descriptor that gets reported to the parent
>>>>> uffd, despite they belong to mm created by failed forks.
>>>>>
>>>>> https://www.spinics.net/lists/linux-mm/msg136357.html
>>>>>
>>>> Hi, Andrea
>>>>
>>>> I still not clear why uffd ioctl can use the incomplete process as the mm->owner.
>>>> and how to produce the race.
>>> There is a C reproducer in  the syzcaller report:
>>>
>>> https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000
>>>  
>>>> From your above explainations,   My underdtanding is that the process handling do_exexve
>>>> will have a temporary mm,  which will be used by the UUFD ioctl.
>>> The race is between userfaultfd operation and fork() failure:
>>>
>>> forking thread                  | userfaultfd monitor thread
>>> --------------------------------+-------------------------------
>>> fork()                          |
>>>   dup_mmap()                    |
>>>     dup_userfaultfd()           |
>>>     dup_userfaultfd_complete()  |
>>>                                 |  read(UFFD_EVENT_FORK)
>>>                                 |  uffdio_copy()
>>>                                 |    mmget_not_zero()
>>>     goto bad_fork_something     |
>>>     ...                         |
>>> bad_fork_free:                  |
>>>       free_task()               |
>>>                                 |  mem_cgroup_from_task()
>>>                                 |       /* access stale mm->owner */
>>>  
>> Hi, Mike
> Hi, Zhong,
>
>> forking thread fails to create the process ,and then free the allocated task struct.
>> Other userfaultfd monitor thread should not access the stale mm->owner.
>>
>> The parent process and child process do not share the mm struct.  Userfaultfd monitor thread's
>> mm->owner should not point to the freed child task_struct.
> IIUC the problem is that above mm (of the mm->owner) is the child
> process's mm rather than the uffd monitor's.  When
> dup_userfaultfd_complete() is called there will be a new userfaultfd
> context sent to the uffd monitor thread which linked to the chlid
> process's mm, and if the monitor thread do UFFDIO_COPY upon the newly
> received userfaultfd it'll operate on that new mm too.
Thank Mike and Peter for further explanation. I get it.

Yes, The race indeed will result in the issue.

but as for the patch Andrea has posted. I still has a little worry.

The patch use call_rcu to delay free the task_struct, but It is possible to free the task_struct
ahead of get_mem_cgroup_from_mm. is it right?

Thanks,
zhong jiang
>> and due to the existence of tasklist_lock,  we can not specify the mm->owner to freed task_struct.
>>
>> I miss something,=-O
>>
>> Thanks,
>> zhong jiang
>>>> Thanks,
>>>> zhong jiang
>>
> Regards,
>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ