linux-kernel - Re: [PATCH 04/19] sched: Prepare for Core-wide rq->lock

Message-ID: <CABk29NvaH687GfOm_b5_hJF6HBQ6fu+1hzc0GFNEMv5mj3DrUw@mail.gmail.com>
Date:   Tue, 27 Apr 2021 16:30:02 -0700
From:   Josh Don <joshdon@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Joel Fernandes <joel@...lfernandes.org>,
        "Hyser,Chris" <chris.hyser@...cle.com>,
        Ingo Molnar <mingo@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>, dhiatt@...italocean.com
Subject: Re: [PATCH 04/19] sched: Prepare for Core-wide rq->lock

On Mon, Apr 26, 2021 at 3:21 PM Josh Don <joshdon@...gle.com> wrote:
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index f732642e3e09..1a81e9cc9e5d 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -290,6 +290,10 @@ static void sched_core_assert_empty(void)
> >  static void __sched_core_enable(void)
> >  {
> >         static_branch_enable(&__sched_core_enabled);
> > +       /*
> > +        * Ensure raw_spin_rq_*lock*() have completed before flipping.
> > +        */
> > +       synchronize_sched();
>
> synchronize_rcu()
>
> >         __sched_core_flip(true);
> >         sched_core_assert_empty();
> >  }
> > @@ -449,16 +453,22 @@ void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
> >  {
> >         raw_spinlock_t *lock;
> >
> > +       preempt_disable();
> >         if (sched_core_disabled()) {
> >                 raw_spin_lock_nested(&rq->__lock, subclass);
> > +               /* preempt *MUST* still be disabled here */
> > +               preempt_enable_no_resched();
> >                 return;
> >         }
>
> This approach looks good to me. I'm guessing you went this route
> instead of doing the re-check after locking in order to optimize the
> disabled case?
>
> Recommend a comment that the preempt_disable() here pairs with the
> synchronize_rcu() in __sched_core_enable().
>
> >
> >         for (;;) {
> >                 lock = __rq_lockp(rq);
> >                 raw_spin_lock_nested(lock, subclass);
> > -               if (likely(lock == __rq_lockp(rq)))
> > +               if (likely(lock == __rq_lockp(rq))) {
> > +                       /* preempt *MUST* still be disabled here */
> > +                       preempt_enable_no_resched();
> >                         return;
> > +               }
> >                 raw_spin_unlock(lock);
> >         }
> >  }

Also, did you mean to have a preempt_enable_no_resched() rather than
preempt_enable() in raw_spin_rq_trylock()?

I went over the rq_lockp stuff again after Don's reported lockup. Most
uses are safe because the caller already holds an rq lock. However,
double_rq_unlock() is prone to a race:

double_rq_unlock(rq1, rq2):
/*
 * Initial state: core sched enabled, and rq1 and rq2 are smt
 * siblings. So, double_rq_lock(rq1, rq2) only took a single rq lock.
 */
raw_spin_rq_unlock(rq1);
/* now not holding any rq lock */
/*
 * sched core disabled. Now __rq_lockp(rq1) != __rq_lockp(rq2), so we
 * falsely unlock rq2.
 */
if (__rq_lockp(rq1) != __rq_lockp(rq2))
        raw_spin_rq_unlock(rq2);
else
        __release(rq2->lock);

To prevent this, we can instead cache __rq_lockp(rq1) and
__rq_lockp(rq2) before releasing the first lock. FWIW I think it is
likely that Don is seeing a different issue.
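
To make the caching idea concrete, here is a minimal userspace sketch,
not kernel code. Everything named toy_* is hypothetical and exists only
for illustration: toy_lock merely tracks whether it is held, and the
concurrent disable of core sched is simulated by clearing a flag inside
the unlock path, at the point where no rq lock is held. It only shows
the shape of the fix, assuming the race is exactly the one described
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for raw_spinlock_t: only records whether it is held. */
struct toy_lock { bool held; };

/* Toy stand-in for struct rq: own lock plus a pointer to the core's rq. */
struct toy_rq {
	struct toy_lock lock;
	struct toy_rq *core;
};

static bool toy_core_enabled;	/* models the core sched static branch */
static int toy_bad_unlocks;	/* counts releases of a lock not held */

/* Models __rq_lockp(): core-wide lock when enabled, per-rq lock otherwise. */
static struct toy_lock *toy_rq_lockp(struct toy_rq *rq)
{
	return toy_core_enabled ? &rq->core->lock : &rq->lock;
}

static void toy_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_release(struct toy_lock *l)
{
	if (!l->held)
		toy_bad_unlocks++;	/* the "false unlock" from the mail */
	l->held = false;
}

static void toy_double_rq_lock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	struct toy_lock *l1 = toy_rq_lockp(rq1);
	struct toy_lock *l2 = toy_rq_lockp(rq2);

	toy_acquire(l1);
	if (l1 != l2)
		toy_acquire(l2);
}

/* Racy variant: re-reads the lock pointers after dropping the first lock. */
static void toy_double_rq_unlock_racy(struct toy_rq *rq1, struct toy_rq *rq2)
{
	toy_release(toy_rq_lockp(rq1));
	toy_core_enabled = false;	/* simulated concurrent disable */
	if (toy_rq_lockp(rq1) != toy_rq_lockp(rq2))
		toy_release(toy_rq_lockp(rq2));
}

/* Cached variant: snapshots both lock pointers while a lock is still held. */
static void toy_double_rq_unlock_cached(struct toy_rq *rq1, struct toy_rq *rq2)
{
	struct toy_lock *l1 = toy_rq_lockp(rq1);
	struct toy_lock *l2 = toy_rq_lockp(rq2);

	toy_release(l1);
	toy_core_enabled = false;	/* simulated concurrent disable */
	if (l1 != l2)
		toy_release(l2);
}

/* Runs one lock/unlock cycle on two SMT siblings; returns bad unlock count. */
int toy_demo(bool use_cached)
{
	struct toy_rq rq1 = { .lock = { false } };
	struct toy_rq rq2 = { .lock = { false } };

	rq1.core = &rq1;
	rq2.core = &rq1;	/* siblings share rq1's lock when enabled */
	toy_core_enabled = true;
	toy_bad_unlocks = 0;

	toy_double_rq_lock(&rq1, &rq2);
	if (use_cached)
		toy_double_rq_unlock_cached(&rq1, &rq2);
	else
		toy_double_rq_unlock_racy(&rq1, &rq2);

	return toy_bad_unlocks;
}
```

In this model the racy variant releases rq2's lock even though it was
never taken, while the cached variant releases exactly what
toy_double_rq_lock() acquired. The real double_rq_unlock() would
presumably also keep the __release() annotation for the single-lock
case, which the sketch leaves out.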