linux-kernel - Re: [PATCH next] sbitmap: fix lockup while swapping

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <929a3aba-72b0-5e-5b80-824a2b7f5dc7@google.com>
Date:   Fri, 23 Sep 2022 16:15:29 -0700 (PDT)
From:   Hugh Dickins <hughd@...gle.com>
To:     Keith Busch <kbusch@...nel.org>
cc:     Hugh Dickins <hughd@...gle.com>, Jan Kara <jack@...e.cz>,
        Jens Axboe <axboe@...nel.dk>,
        Yu Kuai <yukuai1@...weicloud.com>,
        Liu Song <liusong@...ux.alibaba.com>,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH next] sbitmap: fix lockup while swapping

On Fri, 23 Sep 2022, Hugh Dickins wrote:
> On Fri, 23 Sep 2022, Keith Busch wrote:
> 
> > Does the following fix the observation? Rational being that there's no reason
> > to spin on the current wait state that is already under handling; let
> > subsequent clearings proceed to the next inevitable wait state immediately.
> 
> It's running fine without lockup so far; but doesn't this change merely
> narrow the window?  If this is interrupted in between atomic_try_cmpxchg()
> setting wait_cnt to 0 and sbq_index_atomic_inc() advancing wake_index,
> don't we run the same risk as before, of sbitmap_queue_wake_up() from
> the interrupt handler getting stuck on that wait_cnt 0?

Yes, it ran successfully for 50 minutes, then an interrupt came in
immediately after the cmpxchg, and it locked up just as before.

Easily dealt with by disabling interrupts, no doubt, but I assume it's a
badge of honour not to disable interrupts here (except perhaps in waking).

Some clever way to make the wait_cnt and wake_index adjustments atomic?

Or is this sbitmap_queue_wake_up() interrupting sbitmap_queue_wake_up()
just supposed never to happen, the counts preventing it: but some
misaccounting letting it happen by mistake?

> 
> > 
> > ---
> > diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> > index 624fa7f118d1..47bf7882210b 100644
> > --- a/lib/sbitmap.c
> > +++ b/lib/sbitmap.c
> > @@ -634,6 +634,13 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
> >  
> >  	*nr -= sub;
> >  
> > +	/*
> > +	 * Increase wake_index before updating wait_cnt, otherwise concurrent
> > +	 * callers can see valid wait_cnt in old waitqueue, which can cause
> > +	 * invalid wakeup on the old waitqueue.
> > +	 */
> > +	sbq_index_atomic_inc(&sbq->wake_index);
> > +
> >  	/*
> >  	 * When wait_cnt == 0, we have to be particularly careful as we are
> >  	 * responsible to reset wait_cnt regardless whether we've actually
> > @@ -660,13 +667,6 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
> >  	 * of atomic_set().
> >  	 */
> >  	smp_mb__before_atomic();
> > -
> > -	/*
> > -	 * Increase wake_index before updating wait_cnt, otherwise concurrent
> > -	 * callers can see valid wait_cnt in old waitqueue, which can cause
> > -	 * invalid wakeup on the old waitqueue.
> > -	 */
> > -	sbq_index_atomic_inc(&sbq->wake_index);
> >  	atomic_set(&ws->wait_cnt, wake_batch);
> >  
> >  	return ret || *nr;
> > --
>