lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 20 Nov 2018 19:20:40 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     Vlastimil Babka <vbabka@...e.cz>
cc:     Hugh Dickins <hughd@...gle.com>, Baoquan He <bhe@...hat.com>,
        Michal Hocko <mhocko@...nel.org>,
        David Hildenbrand <david@...hat.com>, linux-mm@...ck.org,
        pifang@...hat.com, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org, aarcange@...hat.com,
        Mel Gorman <mgorman@...e.de>
Subject: Re: Memory hotplug softlock issue

On Tue, 20 Nov 2018, Hugh Dickins wrote:
> On Tue, 20 Nov 2018, Vlastimil Babka wrote:
> > >  
> > >  	finish_wait(q, wait);
> > 
> > ... the code continues by:
> > 
> >         if (thrashing) {
> >                 if (!PageSwapBacked(page))
> > 
> > So maybe we should not set 'thrashing' true when lock < 0?
> 
> Very good catch, thank you Vlastimil: as you might have guessed, the
> patch from a pre-PSI kernel applied cleanly, and I just hadn't reviewed
> the surrounding context properly before sending out.
> 
> I cannot say immediately what the right answer is, I'll have to do some
> research first: maybe not enter the block that sets thrashing true when
> lock < 0, as you suggest, or maybe force lock < 0 to 0 and put_page()
> afterwards, or... 

... I still won't adjust the patch tonight, but the answer is obvious
now I look closer: as you show in your extract above, the only thing
it does with "page" at the end is to ask if it was SwapBacked, so we
just need to set one more bool at the beginning to check at the end
(or I could make "thrashing" a -1, 0, 1 int like "lock": but my guess
is that that would not be to other people's taste: acceptable for the
arg, but stretching your patience for the local variable).

By the way, I do have a further patch to wait_on_page_bit_common(),
which I could send at the same time, if it sounds right to you
(but it's a no-op in the put_and_wait page migration case).  That
__add_wait_queue_entry_tail() is right for the first time into
the loop, but I maintain that it should use __add_wait_queue()
for fairness thereafter, to avoid repeatedly sending older waiters
back to the back of the queue.  I don't have hard numbers for it,
but it's one of several patches, each of which helped to reduce our
wait_on_page_bit lockups in some (perhaps unrealistic) stress tests.

Hugh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ