Message-ID: <CAAq0SUmYRHQLcGx2K73suEPDzpZeTU5gKeFvZb1Q+y56J5h+Tg@mail.gmail.com>
Date:   Tue, 17 Jan 2023 17:00:35 -0300
From:   Wander Lairson Costa <wander@...hat.com>
To:     Waiman Long <longman@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        "open list:LOCKING PRIMITIVES" <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH] rtmutex: ensure we wake up the top waiter

On Tue, Jan 17, 2023 at 4:32 PM Waiman Long <longman@...hat.com> wrote:
>
> On 1/17/23 12:26, Wander Lairson Costa wrote:
> > In task_blocked_on_lock() we save the owner, release the wait_lock and
> > call rt_mutex_adjust_prio_chain(). Before we acquire the wait_lock
> > again, the owner may release the lock and deboost.
> Are you referring to task_blocks_on_rt_mutex(), not task_blocked_on_lock()?
> >
> > rt_mutex_adjust_prio_chain() acquires the wait_lock. In the requeue
> > phase, the waiter may initially be at the top of the queue, but
> > after being dequeued and requeued that may no longer be the case.
> >
> > This scenario ends up waking the wrong task, which will see it is
> > not the top waiter and go back to sleep. Now we have a situation in
> > which no task holds the lock, yet no one ever acquires it.
> >
> > We can reproduce the bug in PREEMPT_RT with stress-ng:
> >
> > while true; do
> >      stress-ng --sched deadline --sched-period 1000000000 \
> >           --sched-runtime 800000000 --sched-deadline \
> >           1000000000 --mmapfork 23 -t 20
> > done
> >
> > Signed-off-by: Wander Lairson Costa <wander@...hat.com>
> > ---
> >   kernel/locking/rtmutex.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> > index 010cf4e6d0b8..728f434de2bb 100644
> > --- a/kernel/locking/rtmutex.c
> > +++ b/kernel/locking/rtmutex.c
> > @@ -901,8 +901,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
> >                * then we need to wake the new top waiter up to try
> >                * to get the lock.
> >                */
> > -             if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
> > -                     wake_up_state(waiter->task, waiter->wake_state);
> > +             top_waiter = rt_mutex_top_waiter(lock);
> > +             if (prerequeue_top_waiter != top_waiter)
> > +                     wake_up_state(top_waiter->task, top_waiter->wake_state);
> >               raw_spin_unlock_irq(&lock->wait_lock);
> >               return 0;
> >       }
>
> I would say that if an rt_mutex has no owner but has waiters, we should
> always wake up the top waiter whether or not it is the current waiter.

In practice, there is a case in which the unlock code wakes up the
top waiter, but before that task actually wakes up, a third running
task changes the top waiter.
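
To make the interleaving concrete, here is a minimal userspace sketch
of the same lost-wakeup pattern (an analogy only, not the kernel code;
the names wait_lock, top_waiter, lock_is_free and so on are made up
for illustration):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NWAITERS 2

static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake[NWAITERS];  /* one wake channel per waiter */
static int top_waiter;                 /* index of the queue head     */
static int lock_is_free;               /* the "rtmutex" has no owner  */

static void *waiter(void *arg)
{
        int id = (int)(long)arg;

        pthread_mutex_lock(&wait_lock);
        /* Sleep until the lock is free and we are the top waiter. */
        while (!(lock_is_free && top_waiter == id))
                pthread_cond_wait(&wake[id], &wait_lock);
        lock_is_free = 0;              /* acquire the lock */
        pthread_mutex_unlock(&wait_lock);
        printf("waiter %d acquired the lock\n", id);
        return NULL;
}

int main(void)
{
        pthread_t t[NWAITERS];
        int stale_top;

        for (long i = 0; i < NWAITERS; i++) {
                pthread_cond_init(&wake[i], NULL);
                pthread_create(&t[i], NULL, waiter, (void *)i);
        }
        sleep(1);                      /* let both waiters block */

        pthread_mutex_lock(&wait_lock);
        stale_top = top_waiter;        /* like prerequeue_top_waiter   */
        top_waiter = 1;                /* "requeue": waiter 1 is first */
        lock_is_free = 1;
        /*
         * Buggy pattern: wake the waiter captured before the requeue.
         * Waiter 0 wakes, sees top_waiter != 0 and goes back to sleep;
         * waiter 1, the real top waiter, is never woken -> stall.
         */
        pthread_cond_signal(&wake[stale_top]);
        pthread_mutex_unlock(&wait_lock);

        sleep(1);                      /* nobody acquired the lock */

        pthread_mutex_lock(&wait_lock);
        /* The fix: re-read the top waiter under wait_lock, wake *it*. */
        pthread_cond_signal(&wake[top_waiter]);
        pthread_mutex_unlock(&wait_lock);

        pthread_join(t[1], NULL);      /* waiter 0 stays blocked, as in
                                          the kernel scenario          */
        return 0;
}

Re-reading the top waiter under wait_lock before issuing the wakeup,
as the patch does with rt_mutex_top_waiter(), is the analogue of the
final signal above.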

> So what is the result of the stress-ng run above? Is it a hang? It is
> not clear from your patch description.

It manifests as an rcu_preempt stall.
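
The report in dmesg begins with the usual stall banner, roughly
(trimmed, illustrative):

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: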

>
> I am not that familiar with the rt_mutex code, so I am cc'ing Thomas
> and Sebastian for their input.
>
> Cheers,
> Longman
>
