lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 22 Jul 2020 15:10:44 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Michal Hocko <mhocko@...nel.org>, Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page

On Wed, Jul 22, 2020 at 2:29 PM Hugh Dickins <hughd@...gle.com> wrote:
>
> -#define PAGE_WAIT_TABLE_BITS 8
> +#define PAGE_WAIT_TABLE_BITS 10

Well, that seems harmless even on small machines.

> +       bool first_time = true;
>         bool thrashing = false;
>         bool delayacct = false;
>         unsigned long pflags;
> @@ -1134,7 +1135,12 @@ static inline int wait_on_page_bit_commo
>                 spin_lock_irq(&q->lock);
>
>                 if (likely(list_empty(&wait->entry))) {
> -                       __add_wait_queue_entry_tail(q, wait);
> +                       if (first_time) {
> +                               __add_wait_queue_entry_tail(q, wait);
> +                               first_time = false;
> +                       } else {
> +                               __add_wait_queue(q, wait);
> +                       }
>                         SetPageWaiters(page);
>                 }

This seems very hacky.

And in fact, looking closer, I'd say that there are more serious problems here.

Look at that WQ_FLAG_EXCLUSIVE thing: non-exclusive waits should
always go at the head (because they're not going to steal the bit,
they just want to know when it got cleared), and exclusive waits
should always go at the tail (because of fairness).

But that's not at all what we do.

Your patch adds even more confusion to this nasty area.

And your third one:

> +               if (ret)
> +                       woken++;
>
> -               if (bookmark && (++cnt > WAITQUEUE_WALK_BREAK_CNT) &&
> +               if (bookmark && (++cnt > WAITQUEUE_WALK_BREAK_CNT) && woken &&

I've got two reactions to this

 (a) we should not need a new "woken" variable, we should just set a
high bit of "cnt" and make WAITQUEUE_WALK_BREAK_CNT contain that high
bit

     (Tune "high bit" to whatever you want: it could be either the
_real_ high bit of the variable, or it could be something like "128",
which would mean that you'd break out after 128 non-waking entries).

 (b) Ugh, what hackery and magic behavior regardless

I'm really starting to hate that wait_on_page_bit_common() function.

See a few weeks ago how the function looks buggy to begin with

  https://lore.kernel.org/lkml/CAHk-=wjJA2Z3kUFb-5s=6+n0qbTs8ELqKFt9B3pH85a8fGD73w@mail.gmail.com/

and that never got resolved either (but probably never happens in practice).

              Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ