Message-ID: <Z9BWGsZX9CFHUXQo@localhost.localdomain>
Date: Tue, 11 Mar 2025 16:26:18 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>,
Benjamin Segall <bsegall@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Andrey Vagin <avagin@...nvz.org>,
Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
Peter Zijlstra <peterz@...radead.org>,
Cyrill Gorcunov <gorcunov@...il.com>
Subject: Re: [patch V3 16/18] posix-timers: Dont iterate /proc/$PID/timers
with sighand:: Siglock held
On Sat, Mar 08, 2025 at 05:48:45PM +0100, Thomas Gleixner wrote:
> The readout of /proc/$PID/timers holds sighand::siglock with interrupts
> disabled. That is required to protect against concurrent modifications of
> the task::signal::posix_timers list because the list is not RCU safe.
>
> With the conversion of the timer storage to an RCU protected hlist, this is
> no longer required.
>
> The only requirement is to protect the returned entry against a concurrent
> free, which is trivial as the timers are RCU protected.
>
> Removing the trylock of sighand::siglock is benign because the lifetime of
> task_struct::signal is bound to the lifetime of the task_struct itself.
>
> There are two scenarios where this matters:
>
> 1) The process is alive and not about to be checkpointed
>
> 2) The process is stopped via ptrace for checkpointing
>
> #1 is a racy snapshot of the armed timers and nothing can rely on it. It is
> nothing more than debug information, and it has been that way before because
> the sighand lock is dropped when the buffer is full and the restart of
> the iteration might find a completely different set of timers.
>
> The task and therefore task::signal cannot be freed as timers_start()
> acquired a reference count via get_pid_task().
>
> #2 the process is stopped for checkpointing so nothing can delete or create
> timers at this point. Neither can the process exit during the traversal.
>
> If CRIU fails to observe an exit in progress prior to the dissemination
> of the timers, then there are more severe problems to solve in the CRIU
> mechanics as they can't rely on posix timers being enabled in the first
> place.
>
> Therefore replace the lock acquisition with rcu_read_lock() and switch the
> timer storage traversal over to seq_hlist_*_rcu().
>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Reviewed-by: Frederic Weisbecker <frederic@...nel.org>
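
For readers following along, the shape of the change described in the
changelog can be modeled in userspace roughly as below. This is only an
illustrative sketch, not the patch and not kernel code: the model_*()
functions, struct timer_node and the rcu_depth counter are hypothetical
stand-ins for rcu_read_lock()/rcu_read_unlock() and the seq_hlist_*_rcu()
helpers, and a plain singly linked list stands in for the RCU protected
hlist. It only shows that the read-side section opens in the start step,
spans the traversal, and closes in the stop step, with no siglock involved.

```c
/* Userspace model only. All model_* names are hypothetical stand-ins. */
#include <assert.h>
#include <stddef.h>

/* Stand-in for a timer linked into the per-process timer list. */
struct timer_node {
	int id;
	struct timer_node *next;
};

/* Stand-in for the RCU read-side nesting depth (real RCU does far more). */
static int rcu_depth;

static void model_rcu_read_lock(void)
{
	rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Start step: enter the read-side section, then seek to position pos. */
static struct timer_node *model_seq_start(struct timer_node *head, long pos)
{
	struct timer_node *n;

	model_rcu_read_lock();
	for (n = head; n; n = n->next) {
		if (pos-- == 0)
			return n;
	}
	return NULL;
}

/* Next step: advance the position and return the following entry. */
static struct timer_node *model_seq_next(struct timer_node *cur, long *pos)
{
	++*pos;
	return cur ? cur->next : NULL;
}

/* Stop step: leave the read-side section. */
static void model_seq_stop(void)
{
	model_rcu_read_unlock();
}

/* Walk the list and record up to max timer ids; returns how many were seen. */
static int model_dump_timers(struct timer_node *head, int *ids, int max)
{
	struct timer_node *t;
	long pos = 0;
	int n = 0;

	for (t = model_seq_start(head, pos); t && n < max;
	     t = model_seq_next(t, &pos))
		ids[n++] = t->id;
	model_seq_stop();
	return n;
}
```

As in case #1 of the changelog, a walk like this is only a snapshot: the
model says nothing about timers added or removed between two walks.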