Message-ID: <5de4157b-33e0-440c-8a23-0f1a30253b5b@linaro.org>
Date: Fri, 10 Jan 2025 14:14:07 -0300
From: Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
To: Florian Weimer <fweimer@...hat.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
"libc-alpha@...rceware.org" <libc-alpha@...rceware.org>,
"carlos@...hat.com" <carlos@...hat.com>, Mark Rutland
<mark.rutland@....com>, linux-kernel <linux-kernel@...r.kernel.org>,
x86@...nel.org, paulmck <paulmck@...nel.org>,
Michael Jeanson <mjeanson@...icios.com>
Subject: Re: Prevent inconsistent CPU state after sequence of dlclose/dlopen
On 10/01/25 14:10, Florian Weimer wrote:
> * Mathieu Desnoyers:
>
>> On 2025-01-10 11:54, Peter Zijlstra wrote:
>>> On Fri, Jan 10, 2025 at 10:55:36AM -0500, Mathieu Desnoyers wrote:
>>>> Hi,
>>>>
>>>> I was discussing with Mark Rutland recently, and he pointed out that a
>>>> sequence of dlclose/dlopen mapping new code at the same addresses in
>>>> multithreaded environments is an issue on ARM, and possibly on Intel/AMD
>>>> with the newer TLB broadcast maintenance.
>>> What is the exact race? Should not munmap() invalidate the TLBs before
>>> it allows overlapping mmap() to complete?
>>
>> The race Mark mentioned (on ARM) is AFAIU the following scenario:
>>
>> CPU 0                          CPU 1
>>
>> - dlopen()
>>   - mmap PROT_EXEC @addr
>>                                - fetch insn @addr, CPU state expects
>>                                  unchanged insn.
>>                                - execute unrelated code
>> - dlclose(addr)
>>   - munmap @addr
>> - dlopen()
>>   - mmap PROT_EXEC @addr
>>                                - fetch new insn @addr. Incoherent CPU
>>                                  state.
>
> Unmapping an object while code is executing in it is undefined.
>
> We have a problem with things like pthread_atfork handlers. We can't
> use locking there because fork handlers are expected to perform ample
> locking themselves, and an extra lock around them would run into lock
> ordering issues. (We tried for unrelated reasons and saw deadlocks in
> applications.)
>
> What we can do is bump a reference counter while we run a pthread_atfork
> callback (we already associate them with DSOs) and skip the munmap part
> in dlclose if the counter is not zero. We can complete the unmapping
> after the fork handler returns (maybe in the parent only).

We can also make dlclose a no-op (like some runtimes do), although this
has other implications.

>
> There might be other callbacks besides fork handlers that have this
> problem. A similar treatment is possible for some of them, hopefully
> all of them in glibc. We cannot cover things like std::shared_ptr
> destructor calls, though. But adding more barriers won't fix those,
> either.
>
> Thanks,
> Florian
>
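
For illustration, the deferred-unmap idea quoted above (bump a counter
around the callback, skip the munmap in dlclose while it is non-zero,
finish the unmapping once the handler returns) could be sketched roughly
as follows. This is a simplified model, not glibc code: struct dso,
dso_init, dso_close, dso_run_callback and dso_unmap are hypothetical
names, and a boolean flag stands in for the real munmap. It also assumes
that no new callback can start in a DSO once dlclose has begun, which a
real implementation would have to enforce separately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the loader's per-DSO bookkeeping. */
struct dso {
    atomic_int  callbacks_running; /* callbacks currently executing in this DSO */
    atomic_bool unmap_pending;     /* dlclose was requested, unmap not done yet */
    atomic_bool mapped;            /* models whether the code is still mapped */
};

static void dso_init(struct dso *d)
{
    atomic_init(&d->callbacks_running, 0);
    atomic_init(&d->unmap_pending, false);
    atomic_init(&d->mapped, true);
}

/* Models the munmap step; a real loader would unmap the segments here. */
static void dso_unmap(struct dso *d)
{
    atomic_store(&d->mapped, false);
}

/* dlclose path: record the request, unmap only if no callback is running. */
static void dso_close(struct dso *d)
{
    atomic_store(&d->unmap_pending, true);
    if (atomic_load(&d->callbacks_running) == 0
        && atomic_exchange(&d->unmap_pending, false))
        dso_unmap(d);
}

/* Callback path: hold a reference for the duration of the call. The last
   callback to leave completes an unmap that dso_close had to skip. */
static void dso_run_callback(struct dso *d, void (*cb)(void *), void *arg)
{
    atomic_fetch_add(&d->callbacks_running, 1);
    cb(arg);
    if (atomic_fetch_sub(&d->callbacks_running, 1) == 1
        && atomic_exchange(&d->unmap_pending, false))
        dso_unmap(d);
}

/* Example callback that closes its own DSO while still executing in it. */
static void closing_callback(void *arg)
{
    struct dso *d = arg;
    dso_close(d);
    assert(atomic_load(&d->mapped)); /* unmap was deferred, code still mapped */
}
```

Running closing_callback through dso_run_callback leaves the object
mapped for as long as the callback executes and unmaps it right after
the callback returns. Calling dso_close with no callback active unmaps
immediately. The atomic_exchange on unmap_pending makes sure that only
one of the two paths performs the unmap.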