Message-ID: <87cygutvds.fsf@oldenburg.str.redhat.com>
Date: Fri, 10 Jan 2025 18:46:07 +0100
From: Florian Weimer <fweimer@...hat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Peter Zijlstra <peterz@...radead.org>,  "libc-alpha@...rceware.org"
 <libc-alpha@...rceware.org>,  "carlos@...hat.com" <carlos@...hat.com>,
  Mark Rutland <mark.rutland@....com>,  linux-kernel
 <linux-kernel@...r.kernel.org>,  x86@...nel.org,  paulmck
 <paulmck@...nel.org>,  Michael Jeanson <mjeanson@...icios.com>
Subject: Re: Prevent inconsistent CPU state after sequence of dlclose/dlopen

* Mathieu Desnoyers:

> On 2025-01-10 12:10, Florian Weimer wrote:
>> * Mathieu Desnoyers:
>> 
>>> On 2025-01-10 11:54, Peter Zijlstra wrote:
>>>> On Fri, Jan 10, 2025 at 10:55:36AM -0500, Mathieu Desnoyers wrote:
>>>>> Hi,
>>>>>
>>>>> I was discussing with Mark Rutland recently, and he pointed out that a
>>>>> sequence of dlclose/dlopen mapping new code at the same addresses in
>>>>> multithreaded environments is an issue on ARM, and possibly on Intel/AMD
>>>>> with the newer TLB broadcast maintenance.
>>>> What is the exact race? Should not munmap() invalidate the TLBs before
>>>> it allows overlapping mmap() to complete?
>>>
>>> The race Mark mentioned (on ARM) is AFAIU the following scenario:
>>>
>>> CPU 0                     CPU 1
>>>
>>> - dlopen()
>>>    - mmap PROT_EXEC @addr
>>>                            - fetch insn @addr, CPU state expects unchanged insn.
>>>                            - execute unrelated code
>>> - dlclose(addr)
>>>    - munmap @addr
>>> - dlopen()
>>>    - mmap PROT_EXEC @addr
>>>                            - fetch new insn @addr. Incoherent CPU state.
>> Unmapping an object while code is executing in it is undefined.
>
> That's not the scenario though. In this scenario, CPU 1 executes
> _unrelated code_ while we unmap @addr.

Oh, so CPU 1 initially executes some code at @addr and returns to some
safe, persistent code (the “execute unrelated code” part).  This code
synchronizes with the dlclose and the dlopen that execute on CPU 0,
obtains a pointer to some supposedly safely published function in the
newly mapped object, and calls it.  And that fails because previously
cached information about the code is invalid?

Additional awkwardness may result if the initial execution is
speculative, and the code on CPU 1 only synchronizes with the dlopen,
and not with the previous dlclose, because it does not know about it
at all?
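
For reference, the address reuse that the quoted scenario depends on can
be sketched with plain mmap/munmap, leaving the dynamic loader out of it.
This is only an illustration under assumptions: the helper name
remap_reuses_address() is hypothetical, the mapping is anonymous rather
than file-backed as it would be for a shared object, and passing the old
address as a hint does not guarantee that the kernel places the second
mapping there.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of any library.
 * Maps one executable page, unmaps it, then maps again using the old
 * address as a hint (no MAP_FIXED, so the kernel may choose otherwise).
 * Returns 1 if the second mapping landed at the same address,
 * 0 if it landed elsewhere, and -1 if any call failed.
 */
int remap_reuses_address(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    /* Stands in for the first dlopen(): "mmap PROT_EXEC @addr". */
    void *first = mmap(NULL, len, PROT_READ | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED)
        return -1;

    /* Stands in for dlclose(): "munmap @addr". */
    if (munmap(first, len) != 0)
        return -1;

    /* Stands in for the second dlopen(): new mapping, same address hint. */
    void *second = mmap(first, len, PROT_READ | PROT_EXEC,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (second == MAP_FAILED)
        return -1;

    int same = (second == first);
    munmap(second, len);
    return same;
}
```

When the helper returns 1, a pointer into the first mapping held by
another thread is numerically equal to a valid pointer into the second
one, which is the precondition for the race described above.  The sketch
says nothing about instruction fetch or cache state on the other CPU.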

Thanks,
Florian

