Message-ID: <87ldy170x9.fsf@oldenburg.str.redhat.com>
Date: Sat, 02 Nov 2024 22:58:42 +0100
From: Florian Weimer <fweimer@...hat.com>
To: André Almeida <andrealmeid@...lia.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Darren Hart
<dvhart@...radead.org>, Davidlohr Bueso <dave@...olabs.net>, Arnd
Bergmann <arnd@...db.de>, sonicadvance1@...il.com,
linux-kernel@...r.kernel.org, kernel-dev@...lia.com,
linux-api@...r.kernel.org, Nathan Chancellor <nathan@...nel.org>
Subject: Re: [PATCH v2 0/3] futex: Create set_robust_list2
* André Almeida:
> 1) x86 apps can have 32bit pointers robust lists. For a x86-64 kernel
> this is not a problem, because of the compat entry point. But there's
> no such compat entry point for AArch64, so the kernel would do the
> pointer arithmetic wrongly. Is also unviable to userspace to keep
> track every addition/removal to the robust list and keep a 64bit
> version of it somewhere else to feed the kernel. Thus, the new
> interface has an option of telling the kernel if the list is filled
> with 32bit or 64bit pointers.
The size is typically different for 32-bit and 64-bit mode (12 vs 24
bytes). Why isn't this enough to disambiguate?
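To make the size argument concrete, here is a rough sketch. The fixed-width structs and the helper name robust_list_pointer_bits() are made up for illustration and are not kernel or glibc definitions; they only mirror the three members of struct robust_list_head (list.next, futex_offset, list_op_pending) at each pointer width:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirrors of struct robust_list_head.
 * Illustration only; these are not the kernel's own types. */
struct robust_list_head32 {
    uint32_t list_next;        /* list.next as a 32-bit pointer */
    int32_t  futex_offset;     /* long on a 32-bit ABI */
    uint32_t list_op_pending;  /* 32-bit pointer */
};

struct robust_list_head64 {
    uint64_t list_next;        /* list.next as a 64-bit pointer */
    int64_t  futex_offset;     /* long on an LP64 ABI */
    uint64_t list_op_pending;  /* 64-bit pointer */
};

static_assert(sizeof(struct robust_list_head32) == 12, "32-bit head is 12 bytes");
static_assert(sizeof(struct robust_list_head64) == 24, "64-bit head is 24 bytes");

/* Return the pointer width in bits implied by the length argument,
 * or 0 if the length matches neither layout. */
static int robust_list_pointer_bits(size_t len)
{
    if (len == sizeof(struct robust_list_head32))
        return 32;
    if (len == sizeof(struct robust_list_head64))
        return 64;
    return 0;
}
```

With layouts like these, the len argument already passed to set_robust_list() would distinguish the two cases without a separate flag.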
> 2) Apps can set just one robust list (in theory, x86-64 can set two if
> they also use the compat entry point). That means that when a x86 app
> asks FEX-Emu to call set_robust_list(), FEX have two options: to
> overwrite their own robust list pointer and make the app robust, or
> to ignore the app robust list and keep the emulator robust. The new
> interface allows for multiple robust lists per application, solving
> this.
Can't you avoid mixing emulated and general userspace code on the same
thread? On emulator threads, you have full control over the TCB.
QEMU hints towards further problems (in linux-user/syscall.c):
    case TARGET_NR_set_robust_list:
    case TARGET_NR_get_robust_list:
        /* The ABI for supporting robust futexes has userspace pass
         * the kernel a pointer to a linked list which is updated by
         * userspace after the syscall; the list is walked by the kernel
         * when the thread exits. Since the linked list in QEMU guest
         * memory isn't a valid linked list for the host and we have
         * no way to reliably intercept the thread-death event, we can't
         * support these. Silently return ENOSYS so that guest userspace
         * falls back to a non-robust futex implementation (which should
         * be OK except in the corner case of the guest crashing while
         * holding a mutex that is shared with another process via
         * shared memory).
         */
        return -TARGET_ENOSYS;
The glibc implementation is not really prepared for this
(__ASSUME_SET_ROBUST_LIST is defined for most architectures). But a
couple of years ago, we had a bunch of kernels that regressed robust
list support on POWER, and I think we found out only when we tested an
unrelated glibc update and saw unexpected glibc test suite failures …
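For illustration only, a run-time check for the ENOSYS case could look roughly like the sketch below. This is not what glibc does; the enum and both function names are invented for this example. It uses get_robust_list() on the calling thread (pid 0) because that does not modify the registered list, and it assumes that an environment which rejects one of the two syscalls with ENOSYS rejects the other as well, as the QEMU code above does:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
enum robust_probe { ROBUST_AVAILABLE, ROBUST_MISSING, ROBUST_UNKNOWN };

/* Map a raw syscall result and errno value to a probe result. */
static enum robust_probe classify_robust_probe(long ret, int err)
{
    if (ret == 0)
        return ROBUST_AVAILABLE;
    if (err == ENOSYS)
        return ROBUST_MISSING;   /* e.g. under qemu-user */
    return ROBUST_UNKNOWN;       /* some other failure, e.g. a filter */
}

/* Ask the kernel for the calling thread's robust list head.
 * This only reads state; it does not change the registration. */
static enum robust_probe probe_robust_list(void)
{
    void *head = NULL;
    size_t len = 0;
    long ret = syscall(SYS_get_robust_list, 0, &head, &len);

    /* syscall() reports failure as -1 with errno set. */
    assert(ret == 0 || ret == -1);
    return classify_robust_probe(ret, ret == 0 ? 0 : errno);
}
```

A caller would fall back to non-robust mutexes (or refuse to create robust ones) when probe_robust_list() returns ROBUST_MISSING.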
Thanks,
Florian