linux-kernel - Re: [PATCH] fs: proc: use down_read_killable in proc_pid_cmdline


Message-ID: <139a6862-ca1a-c291-3e03-8130e35b5fc0@linux.alibaba.com>
Date:   Fri, 23 Feb 2018 12:08:20 -0800
From:   Yang Shi <yang.shi@...ux.alibaba.com>
To:     Alexey Dobriyan <adobriyan@...il.com>
Cc:     akpm@...ux-foundation.org, mingo@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] fs: proc: use down_read_killable in
 proc_pid_cmdline_read()



On 2/23/18 11:45 AM, Alexey Dobriyan wrote:
> On Fri, Feb 23, 2018 at 11:42:34AM -0800, Yang Shi wrote:
>>
>> On 2/23/18 11:33 AM, Alexey Dobriyan wrote:
>>> On Wed, Feb 21, 2018 at 03:13:10PM -0800, Yang Shi wrote:
>>>
>>>>>>> 2) access_remote_vm() et al will do the same ->mmap_sem, and
>>>>>> Yes, it does. But, __access_remote_vm() is called by access_process_vm()
>>>>>> too, which is used by much more places, i.e. ptrace, so I was not sure
>>>>>> if it is preferred to convert to killable version. So, I leave it untouched.
>>>>> Yeah, but ->mmap_sem is taken 3 times per /proc/*/cmdline read
>>>>> and your scalability tests should trigger next backtrace right away.
>>>> Yes, however, I didn't run into it if mmap_sem is acquired earlier.
>>>>
>>>> How about defining a killable version, like
>>>> __access_remote_vm_killable() which use down_read_killable(), then the
>>>> killable version can be used by proc/*/cmdline? There might be other
>>>> users in the future.
>>> It would be a disaster as interfaces multiply.
>> Might be not that bad.
> Maybe.
>
> But you need to explain why there is no backtrace several lines later:
>
> 	access_remote_vm
> 	__access_remote_vm
> 	down_read(&mm->mmap_sem)

I think it might be because:

        CPU A                          CPU B
                                       read /proc/*/cmdline
                                       get_mm
acquire mmap_sem
munmap(300G)                           try to acquire mmap_sem --> go to sleep
release mmap_sem
                                       got mmap_sem
                                       release mmap_sem

                                       access_remote_vm
                                       put_mm


The munmap could just as well start right before access_remote_vm(), in
which case the reader would block in the down_read() there instead. I just
haven't run into that case so far. It may be hit on another machine
or with some changes to the test cases.

BTW, even the hang I hit happened only occasionally, not very often. So
a hang in access_remote_vm() should be even less frequent. But I agree it
is still possible in theory.
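
To make the timing argument concrete, here is a toy userspace model in C. It is only a sketch, not kernel code: struct writer_hold, read_acquire() and first_blocked_acquisition() are hypothetical names, time is in abstract units, and the writer (CPU A doing munmap) is assumed to hold the lock for a single interval. The reader (CPU B) takes the lock for reading several times in a row, and the function reports which acquisition had to sleep.

```c
#include <assert.h>

/* Hypothetical model: the writer holds the lock during [start, end). */
struct writer_hold {
    long start;
    long end;
};

/*
 * The reader asks for the lock at time `now`.
 * Returns the time at which the lock is granted and sets *blocked to 1
 * if the reader had to wait for the writer, 0 otherwise.
 */
static long read_acquire(const struct writer_hold *w, long now, int *blocked)
{
    *blocked = (now >= w->start && now < w->end);
    return *blocked ? w->end : now;
}

/*
 * Walk the reader through n_acq read-side acquisitions. The first attempt
 * is at time first_try; each later attempt comes `gap` units after the
 * previous one was granted. Returns the 0-based index of the acquisition
 * that slept, or -1 if none did.
 */
static int first_blocked_acquisition(const struct writer_hold *w,
                                     long first_try, long gap, int n_acq)
{
    assert(n_acq > 0 && gap > 0 && w->start <= w->end);

    long now = first_try;
    for (int i = 0; i < n_acq; i++) {
        int blocked;

        now = read_acquire(w, now, &blocked);
        if (blocked)
            return i;
        now += gap;
    }
    return -1;
}
```

With three acquisitions per read (the count mentioned earlier in the thread), a writer holding the lock over [10, 1000) and a reader starting at time 20 gives index 0, which matches the timeline above. If the reader starts at time 0 instead, a later acquisition is the one that sleeps, which is the case where the backtrace would show access_remote_vm().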

Regards,
Yang