[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87msieptwy.ffs@tglx>
Date: Mon, 04 Nov 2024 22:31:57 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Anna-Maria Behnsen
<anna-maria@...utronix.de>, John Stultz <jstultz@...gle.com>, Peter
Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, Stephen
Boyd <sboyd@...nel.org>, Eric Biederman <ebiederm@...ssion.com>, Oleg
Nesterov <oleg@...hat.com>
Subject: Re: [patch v6.1 17/20] signal: Queue ignored posixtimers on ignore
list
On Mon, Nov 04 2024 at 16:21, Thomas Gleixner wrote:
> On Mon, Nov 04 2024 at 12:42, Frederic Weisbecker wrote:
>> But there is something more problematic against the delete() path:
>>
>> Thread within Signal target Timer target
>> signal target group
>> -------------------- ------------- -------------
>> timr->it_status = POSIX_TIMER_REQUEUE_PENDING
>> posixtimer_send_sigqueue();
>> do_exit();
>> timer_delete()
>> posix_cpu_timer_del()
>> // return NULL
>> cpu_timer_task_rcu()
>> // timer->it_status NOT set
>> // to POSIX_TIMER_DISARMED
>> hlist_del(&timer->list);
>> posix_timer_cleanup_ignored()
>>
>>
>> do_sigaction(SIG_IGN...)
>> flush_sigqueue_mask()
>> sigqueue_free_ignored()
>> posixtimer_sig_ignore()
>> // Observe POSIX_TIMER_REQUEUE_PENDING
>> hlist_add_head(...ignored_posix_timers)
>> do_exit()
>> exit_itimers()
>> if (hlist_empty(&tsk->signal->posix_timers))
>> return;
>> // leaked timer queued to migrate list
>>
>
> Duh. Let me stare at that some more.
The delete() issue is actually easy to address:
posixtimer_sig_ignore() must check timer->it_signal, which is set to
NULL in timer_delete(). This write must move into the sighand lock
held section, where posix_timer_cleanup_ignored() is invoked.
If NULL, posixtimer_sig_ignore() drops the reference.
If do_sigaction() locked sighand first and moved it to the ignored
list, then posix_timer_cleanup_ignored() will remove it again.
The status part is hard to get right without sighand lock being held,
but it is required to ensure that a pending one-shot timer signal is
dropped in do_sigaction(SIG_IGN).
There is an easy fix for that too:
posixtimer_send_siqqueue() does the following under sighand lock:
timer->it_sig_periodic = timer->it_status == POSIX_TIMER_REQUEUE_PENDING;
posixtimer_sig_ignore() checks that flag. If not set it can drop the
reference independent of the actual status of the timer. If the timer
was rearmed as periodic, then it did not expire yet because the expiry
would have set the flag. If it expires concurrently the expiry
function is stuck on sighand::siglock.
If the flag is set then the signal will go onto the ignored list and
un-ignore will move it back to the pending list. That's not a problem
in the case that the timer was re/dis-armed before or after moving, as
this is all covered by the sequence check.
All of that works because in both cases the protection scheme on the
timer side is that both timer::lock and sighand::siglock have to be held
for modifying
timer::it_sigqueue_seq
timer::it_signal
timer::it_sig_periodic
which means that on the signal side holding sighand::siglock is enough.
In posixtimer_deliver_signal() holding the timer lock is sufficient to
do the sequence validation against timer::it_sig_periodic.
I'll fixup the inconsistent state thing in posix-cpu-timers too and send
out a v7 soon.
Thanks a lot for studying this in detail!
tglx
Powered by blists - more mailing lists