Message-ID: <CAAq0SUmNTvdvhp+67XW+OV6N2DSDjLR5M_CFWBaAAG354kWiMg@mail.gmail.com>
Date:   Tue, 17 Jan 2023 17:01:29 -0300
From:   Wander Lairson Costa <wander@...hat.com>
To:     Waiman Long <longman@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        "open list:LOCKING PRIMITIVES" <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH] rtmutex: ensure we wake up the top waiter

On Tue, Jan 17, 2023 at 4:40 PM Waiman Long <longman@...hat.com> wrote:
>
> On 1/17/23 14:32, Waiman Long wrote:
> > On 1/17/23 12:26, Wander Lairson Costa wrote:
> >> In task_blocked_on_lock() we save the owner, release the wait_lock and
> >> call rt_mutex_adjust_prio_chain(). Before we acquire the wait_lock
> >> again, the owner may release the lock and deboost.
> > Are you referring to task_blocks_on_rt_mutex(), not
> > task_blocked_on_lock()?
> >>
> >> rt_mutex_adjust_prio_chain() acquires the wait_lock. In the requeue
> >> phase, the waiter may initially be at the top of the queue, but after
> >> being dequeued and requeued it may no longer be the top waiter.
> >>
> >> This scenario ends up waking the wrong task, which will see it is not
> >> the top waiter and go back to sleep. Now we have a situation in which
> >> no task holds the lock, yet no one acquires it.
> >>
> >> We can reproduce the bug in PREEMPT_RT with stress-ng:
> >>
> >> while true; do
> >>      stress-ng --sched deadline --sched-period 1000000000 \
> >>              --sched-runtime 800000000 --sched-deadline \
> >>              1000000000 --mmapfork 23 -t 20
> >> done
> >>
> >> Signed-off-by: Wander Lairson Costa <wander@...hat.com>
> >> ---
> >>   kernel/locking/rtmutex.c | 5 +++--
> >>   1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> >> index 010cf4e6d0b8..728f434de2bb 100644
> >> --- a/kernel/locking/rtmutex.c
> >> +++ b/kernel/locking/rtmutex.c
> >> @@ -901,8 +901,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
> >>            * then we need to wake the new top waiter up to try
> >>            * to get the lock.
> >>            */
> >> -        if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
> >> -            wake_up_state(waiter->task, waiter->wake_state);
> >> +        top_waiter = rt_mutex_top_waiter(lock);
> >> +        if (prerequeue_top_waiter != top_waiter)
> >> +            wake_up_state(top_waiter->task, top_waiter->wake_state);
> >>           raw_spin_unlock_irq(&lock->wait_lock);
> >>           return 0;
> >>       }
> >
> > I would say that if an rt_mutex has no owner but has waiters, we
> > should always wake up the top waiter whether or not it is the current
> > waiter. So what is the result of the stress-ng run above? Is it a
> > hang? It is not clear from your patch description.
>
> BTW, if it is a hang, what arch has this problem? x86 or arm64? There is
> a recent report of some rt_mutex locking issue on arm64, I believe. I
> don't know if it is related, so it is important to know on what arch you
> see this problem.
>

x86_64. Note that, at least in my test case, it only manifested under
PREEMPT_RT. (A minimal userspace sketch of the lost wakeup is appended below.)

> Cheers,
> Longman
>

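For readers outside the kernel tree, here is a minimal userspace sketch of
the lost-wakeup pattern described in the patch above. This is not kernel
code: every name in it (struct waiter, enqueue, dequeue, prio) is invented
for illustration and does not come from rtmutex.c. The point is only that,
once the wait_lock has been dropped and re-taken across a requeue, the waker
has to wake whoever sits at the top of the queue now rather than the waiter
it sampled earlier, which is the spirit of the waiter->task to
top_waiter->task change in the diff.

/*
 * Minimal userspace sketch (not kernel code) of the lost-wakeup pattern
 * described in the patch. All names here are invented for illustration.
 */
#include <stdio.h>

struct waiter {
        const char *name;
        int prio;                       /* higher value = higher priority */
        struct waiter *next;
};

static struct waiter *head;             /* wait queue, highest priority first */

/* Insert a waiter in priority order (the kernel would hold wait_lock here). */
static void enqueue(struct waiter *w)
{
        struct waiter **p = &head;

        while (*p && (*p)->prio >= w->prio)
                p = &(*p)->next;
        w->next = *p;
        *p = w;
}

/* Remove a waiter from the queue. */
static void dequeue(struct waiter *w)
{
        struct waiter **p = &head;

        while (*p && *p != w)
                p = &(*p)->next;
        if (*p)
                *p = w->next;
}

int main(void)
{
        struct waiter a = { .name = "A", .prio = 10 };
        struct waiter b = { .name = "B", .prio = 20 };

        enqueue(&a);
        enqueue(&b);

        /* The waker samples the top waiter, then drops the wait_lock. */
        struct waiter *prerequeue_top = head;   /* B */

        /*
         * Meanwhile the owner deboosts: B is dequeued and requeued with a
         * lower priority, so A is now the top waiter.
         */
        dequeue(&b);
        b.prio = 5;
        enqueue(&b);

        /*
         * Buggy: wake the stale sample. B wakes up, sees it is not the top
         * waiter and goes back to sleep, while A, the real top waiter, is
         * never woken, so nobody takes the lock.
         */
        printf("buggy wakeup: %s\n", prerequeue_top->name);

        /* Fixed: re-read the head after re-taking the lock and wake that. */
        printf("fixed wakeup: %s\n", head->name);

        return 0;
}

Built with a plain cc, the "buggy" line wakes B, which is no longer the top
waiter, while the "fixed" line wakes A, the current top waiter.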