lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 22 Jul 2020 16:42:34 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Hugh Dickins <hughd@...gle.com>, Oleg Nesterov <oleg@...hat.com>
Cc:     Michal Hocko <mhocko@...nel.org>, Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page

On Wed, Jul 22, 2020 at 3:10 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> > +       bool first_time = true;
> >         bool thrashing = false;
> >         bool delayacct = false;
> >         unsigned long pflags;
> > @@ -1134,7 +1135,12 @@ static inline int wait_on_page_bit_commo
> >                 spin_lock_irq(&q->lock);
> >
> >                 if (likely(list_empty(&wait->entry))) {
> > -                       __add_wait_queue_entry_tail(q, wait);
> > +                       if (first_time) {
> > +                               __add_wait_queue_entry_tail(q, wait);
> > +                               first_time = false;
> > +                       } else {
> > +                               __add_wait_queue(q, wait);
> > +                       }
> >                         SetPageWaiters(page);
> >                 }
>
> This seems very hacky.
>
> And in fact, looking closer, I'd say that there are more serious problems here.
>
> Look at that WQ_FLAG_EXCLUSIVE thing: non-exclusive waits should
> always go at the head (because they're not going to steal the bit,
> they just want to know when it got cleared), and exclusive waits
> should always go at the tail (because of fairness).
>
> But that's not at all what we do.
>
> Your patch adds even more confusion to this nasty area.

Actually, I think the right thing to do is to just say "we should
never add ourselves back to the list".

We have three cases:

 - DROP: we'll never loop

 - SHARED: we'll break if we see the bit clear - so if we're no longer
on the list, we should break out

 - EXCLUSIVE: we should remain on the list until we get the lock.

and we should just handle these three cases in the wakeup function
instead, I feel.

And then to make it fair to people, let's just always add things to
the head of the queue, whether exclusive or not.

AND NEVER RE-QUEUE.

IOW, something like the attached patch.

NOTE NOTE NOTE! This is both somewhat subtle code, and ENTIRELY
UNTESTED. But it actually fixes the bug mentioned a few weeks ago, it
makes the code simpler in one sense, and it avoids the whole
re-queueuing thing.

It removes all the "is the page locked" testing from the actual wait
loop, and replaces it with "is the wait queue entry still on the list"
instead.

Comments? Oleg, this should fix the race you talked about too.

Note that this patch really is meant as a RFC, and "something like
this". It builds. I tried to think it through. But it might have some
obvious thinko that means that it really really doesn't work.

                    Linus

Download attachment "patch" of type "application/octet-stream" (6408 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ