Message-ID: <Z9C6GpaB9WvNzvJS@pavilion.home>
Date: Tue, 11 Mar 2025 23:32:58 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>,
Benjamin Segall <bsegall@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Andrey Vagin <avagin@...nvz.org>,
Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
Peter Zijlstra <peterz@...radead.org>,
Cyrill Gorcunov <gorcunov@...il.com>
Subject: Re: [patch V3a 17/18] posix-timers: Provide a mechanism to allocate a given timer ID
On Tue, Mar 11, 2025 at 11:07:44PM +0100, Thomas Gleixner wrote:
> Checkpoint/Restore in Userspace (CRIU) needs to reconstruct posix timers
> with the same timer ID on restore. It uses sys_timer_create() and relies on
> the monotonically increasing timer ID provided by this syscall. It creates
> and deletes timers until the desired ID is reached. This can loop for a
> long time when the checkpointed process had a very sparse timer ID range.
>
> It has been debated to implement a new syscall to allow the creation of
> timers with a given timer ID, but that's tedious due to the 32/64bit compat
> issues of sigevent_t and of dubious value.
>
> The restore mechanism of CRIU creates the timers in a state where all
> threads of the restored process are held on a barrier and cannot issue
> syscalls. That means the restorer task has exclusive control.
>
> This makes it possible to address the issue with a prctl() so that the
> restorer thread can do:
>
>     if (prctl(PR_TIMER_CREATE_RESTORE_IDS, PR_TIMER_CREATE_RESTORE_IDS_ON))
>             goto linear_mode;
>     create_timers_with_explicit_ids();
>     prctl(PR_TIMER_CREATE_RESTORE_IDS, PR_TIMER_CREATE_RESTORE_IDS_OFF);
>
> This is backwards compatible because the prctl() fails on older kernels and
> CRIU can fall back to the linear timer ID mechanism. CRIU versions which do
> not know about the prctl() just work as before.
>
> Implement the prctl() and modify timer_create() so that it copies the
> requested timer ID from userspace by utilizing the existing timer_t
> pointer, which is used to copy out the allocated timer ID on success.
>
> If the prctl() is disabled, which it is by default, timer_create() works as
> before and does not try to read from the userspace pointer.
>
> There is no problem when a broken or rogue user space application enables
> the prctl(). If the user space pointer does not contain a valid ID, then
> timer_create() fails. If the data is not initialized, but contains a
> random valid ID, timer_create() will create that random timer ID or fail if
> the ID is already given out.
>
> As CRIU must use the raw syscall to avoid manipulating the internal state
> of the restored process, this has no library dependencies and can be
> adopted by CRIU right away.
>
> Recreating two timers with IDs 1000000 and 2000000 takes 1.5 seconds with
> the create/delete method. With the prctl() it takes 3 microseconds.
>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Reviewed-by: Frederic Weisbecker <frederic@...nel.org>
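
For illustration only, here is a minimal userspace sketch of the restorer-side
flow described in the changelog above. It is not taken from the patch or from
CRIU. The helper name restore_timer_id() is hypothetical, the numeric prctl
values in the #ifndef block are assumed from the mainline UAPI header and are
only used when the installed headers lack them, and the use of SIGEV_NONE and
CLOCK_MONOTONIC is an arbitrary choice for the example.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Assumed values; only used if the installed headers do not define them. */
#ifndef PR_TIMER_CREATE_RESTORE_IDS
#define PR_TIMER_CREATE_RESTORE_IDS      77
#define PR_TIMER_CREATE_RESTORE_IDS_OFF  0
#define PR_TIMER_CREATE_RESTORE_IDS_ON   1
#endif

/*
 * Hypothetical helper: try to create a timer with exactly wanted_id.
 *
 * Returns  0 if the timer was created with the requested ID (*out_id is set),
 *          1 if the prctl() is rejected, so the caller has to fall back to
 *            the linear create/delete method,
 *         -1 if the prctl() worked but timer_create() failed, for example
 *            because the ID is already in use.
 */
int restore_timer_id(int wanted_id, int *out_id)
{
    struct sigevent sev;
    int id = wanted_id;   /* the raw syscall uses an int sized timer ID */
    long ret;

    /* Unused prctl() arguments are passed as zero to be on the safe side. */
    if (prctl(PR_TIMER_CREATE_RESTORE_IDS,
              PR_TIMER_CREATE_RESTORE_IDS_ON, 0, 0, 0))
        return 1;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_NONE;   /* no signal delivery in this sketch */

    /*
     * Raw syscall instead of the libc wrapper: the requested ID goes in
     * through the same pointer that receives the allocated ID on success.
     */
    ret = syscall(SYS_timer_create, CLOCK_MONOTONIC, &sev, &id);

    /* Always switch back to the default behaviour. */
    prctl(PR_TIMER_CREATE_RESTORE_IDS,
          PR_TIMER_CREATE_RESTORE_IDS_OFF, 0, 0, 0);

    if (ret != 0 || id != wanted_id)
        return -1;

    *out_id = id;
    return 0;
}
```

A caller would treat a return value of 1 as the signal to use the existing
create/delete loop, which matches the backwards compatibility argument in the
changelog. A timer created this way can be released again with
syscall(SYS_timer_delete, id).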