lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wg+7Wk7BLmLiDkoNDBvMfoooDVEaLimDY+10Jr9jLKLZg@mail.gmail.com>
Date:   Sun, 26 Jul 2020 13:41:16 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Oleg Nesterov <oleg@...hat.com>, Michal Hocko <mhocko@...nel.org>,
        Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page

On Sun, Jul 26, 2020 at 1:30 PM Hugh Dickins <hughd@...gle.com> wrote:
>
> I've deduced nothing useful from the logs, will have to leave that
> to others here with more experience of them.  But my assumption now
> is that you have successfully removed one bottleneck, so the tests
> get somewhat further and now stick in the next bottleneck, whatever
> that may be.  Which shows up as "failure", where the unlock_page()
> wake_up_page_bit() bottleneck had allowed the tests to proceed in
> a more serially sedate way.

Well, that's the very optimistic reading.

As the optimistic and happy person I am (hah!) I'm going to agree with
you, and plan on just merging that patch early in the next merge
window. It may fix a real bug in the current trere, but it's much too
late to apply right now, particularly with your somewhat ambiguous
results.

Oleg's theoretical race has probably never been seen, and while the
watchdog triggering is clearly a real bug, it's also extreme enough
not to really be a strong argument for merging this out-of-window..

> The xhci handle_cmd_completion list_del bugs (on an older version
> of the driver): weird, nothing to do with page wakeups, I'll just
> have to assume that it's some driver bug exposed by the greater
> stress allowed down, and let driver people investigate (if it
> still manifests) when we take in your improvements.

Do you have the bug-report, just to google against anybody else
reporting something simialr>

> One nice thing from the comparison runs without your patches:
> watchdog panic did crash one of those with exactly the unlock_page()
> wake_up_page_bit() softlockup symptom we've been fighting, that did
> not appear with your patches.  So although the sample size is much
> too small to justify a conclusion, it does tend towards confirming
> your changes.

You win some, you lose some. But yes, I'll take that as a tentative
success and that the approach is valid.

Thanks,

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ