lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 14 Aug 2023 17:38:55 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Mateusz Guzik <mjguzik@...il.com>
Cc:     linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
        brauner@...nel.org, ebiederm@...ssion.com,
        akpm@...ux-foundation.org, linux-mm@...ck.org, koct9i@...il.com,
        oleg@...hat.com, dave@...olabs.net
Subject: Re: [PATCH] kernel/fork: stop playing lockless games for exe_file
 replacement

On 14.08.23 10:54, Mateusz Guzik wrote:
> On 8/14/23, David Hildenbrand <david@...hat.com> wrote:
>> On 14.08.23 10:21, Mateusz Guzik wrote:
>>> On 8/14/23, David Hildenbrand <david@...hat.com> wrote:
>>>> On 13.08.23 14:33, Mateusz Guzik wrote:
>>>>> xchg originated in 6e399cd144d8 ("prctl: avoid using mmap_sem for
>>>>> exe_file serialization"). While the commit message does not explain
>>>>> *why* the change, clearly the intent was to use mmap_sem less in this
>>>>> codepath. I found the original submission [1] which ultimately claims
>>>>> it
>>>>> cleans things up.
>>>>
>>>> More details are apparently in v1 of that patch:
>>>>
>>>> https://lore.kernel.org/lkml/1424979417.10344.14.camel@stgolabs.net/
>>>>
>>>> Regarding your patch: adding more mmap_write_lock() where avoidable, I'm
>>>> not so sure.
>>>>
>>>
>>> But exe_file access is already synchronized with the semaphore and
>>> your own commit added a mmap_read_lock/mmap_read_unlock cycle after
>>> the xchg in question to accomodate this requirement.
>>
>> Yes, we want to handle concurrent fork() ("Don't race with dup_mmap()"),
>> thus mmap_read_lock.
>>
>>> Then mmap_write_lock around the replacement is the obvious thing to do.
>>
>> Apparently to you. :)
>>
>> mmap_write_lock will block more than fork. For example, concurrent page
>> faults (without new VMA locking), for no apparent reason to me.
>>
>>>
>>>> Your patch doesn't look (to me) like it is removing a lot of complexity.
>>>>
>>>
>>> The code in the current form makes the reader ask what prompts xchg +
>>> read-lock instead of mere write-locking.
>>>
>>> This is not a hot path either and afaics it can only cause contention
>>> if userspace is trying to abuse the interface to break the kernel,
>>> messing with a processs hard at work (which would be an extra argument
>>> to not play games on kernel side).
>>>
>>> That said, no, it does not remove "a lot of complexity". It does
>>> remove some though at no real downside that I can see.
>>>
>>> But then again, it is on people who insist on xchg to justify it.
>>
>> Changing it now needs good justification, why we would want to block any
>> concurrent MM activity. Maybe there is good justification.
>>
>> In any case, this commit would have to update the documentation of
>> replace_mm_exe_file, that spells out existing locking behavior.
>>
> 
> Perhaps it will help if I add that the prctl thingy always had a
> troubled relationship with locking.

Yes, it's not the first time that I looked at kernel/sys.c and wodnered 
why some of the toggles don't perform any locking.

> 
> Last time I looked at it was in 2016, where I found that it was doing
> down_read to update arg_start/arg_end and others while a consumer in
> procfs would read them and assert on their sanity. As only a read-lock
> was held, 2 threads could be used to transiently produce a bogus state
> as they race with their updates and trigger the BUG. See this commit
> (but also excuse weirdly bad english ;))
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ddf1d398e517e660207e2c807f76a90df543a217
> 
> Moreover check out the following in prctl_set_auxv:
> 
>          task_lock(current);
>          memcpy(mm->saved_auxv, user_auxv, len);
>          task_unlock(current);
> 
> any thread in the process can reach that codepath while sharing the
> same mm, thus this does not lock any updates. Not only that, but a
> duplicated memcpy onto the area in prctl_set_mm_map does not even take
> that lock and the code to read this does not take any locks.
> 
> [Code duplication and synchronization aside, additional points
> deducted for saved_auxv storing always-NULL pointers instead of adding
> them on reads.]
> 
> The above exhausts my willingness to argue about this change, I'm just
> a passerby. If it is NAKed, I'm dropping the subject.

As long as nobody cares about concurrent MM activity being restricted 
with your change (I suspect we don't care), we're good.

-- 
Cheers,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ