linux-kernel - Re: [PATCH 2/2] mm: THP page cache support for ppc64

Message-ID: <alpine.LSU.2.11.1611111702170.10776@eggly.anvils>
Date:   Fri, 11 Nov 2016 17:37:03 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     "Kirill A. Shutemov" <kirill@...temov.name>
cc:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Hugh Dickins <hughd@...gle.com>, akpm@...ux-foundation.org,
        benh@...nel.crashing.org, paulus@...ba.org, mpe@...erman.id.au,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 2/2] mm: THP page cache support for ppc64

On Fri, 11 Nov 2016, Kirill A. Shutemov wrote:
> On Fri, Nov 11, 2016 at 05:42:11PM +0530, Aneesh Kumar K.V wrote:
> > 
> > doing this in do_set_pmd keeps this closer to where we set the pmd. Any
> > reason you think we should move it higher up the stack? We already do
> > pte_alloc() at the same level for a non transhuge case in
> > alloc_set_pte().
> 
> I vaguely remember Hugh mentioned a deadlock of allocation under page-lock vs.
> OOM-killer (or something else?).

You remember well.  It was indeed the OOM killer, but in particular due
to the way it used to wait for a current victim to exit, and that exit
could be delayed forever by the way munlock_vma_pages_all() goes to lock
each page in a VM_LOCKED area - a pity if one of them is the page we
hold locked while servicing a fault and need to allocate a pagetable.

> 
> If the deadlock is still there it would be a matter of making preallocation
> unconditional to fix the issue.

I think enough has changed at the OOM killer end that the deadlock is
no longer there.  I haven't kept up with all the changes made recently,
but I think we no longer wait for a unique victim to exit before trying
another (reaped mms set MMF_OOM_SKIP); and the OOM reaper skips over
VM_LOCKED areas to avoid just such a deadlock.
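
To make the cycle concrete, here is a small userspace model of the
wait-for chain described above.  It is a sketch only, not kernel code:
the actor names and the waits_for table are illustrative assumptions,
and the three scenario functions simply encode the old behaviour, the
unconditional-preallocation idea, and the "no longer wait for a unique
victim" behaviour as I understand them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical actors; the names are illustrative, not kernel symbols. */
enum actor {
    FAULTING_TASK,  /* holds the page lock, wants to allocate a pagetable */
    OOM_KILLER,     /* the stalled allocation, waiting on the OOM victim */
    OOM_VICTIM,     /* exiting; munlock wants the lock on each mlocked page */
    NR_ACTORS,
    NOBODY = -1
};

/* waits_for[a] is the actor that a is blocked on, or NOBODY.
 * Returns true if following the chain from start leads back to start. */
bool has_wait_cycle(const int waits_for[NR_ACTORS], int start)
{
    int cur = start;
    size_t steps;

    assert(start >= 0 && start < NR_ACTORS);
    for (steps = 0; steps < NR_ACTORS; steps++) {
        cur = waits_for[cur];
        if (cur == NOBODY)
            return false;   /* chain ends: somebody can make progress */
        if (cur == start)
            return true;    /* came back to where we began: deadlock */
    }
    return false;
}

/* Old behaviour: allocate under page lock, OOM killer waits for the victim. */
bool old_behaviour_deadlocks(void)
{
    const int waits_for[NR_ACTORS] = {
        [FAULTING_TASK] = OOM_KILLER,
        [OOM_KILLER]    = OOM_VICTIM,
        [OOM_VICTIM]    = FAULTING_TASK,
    };
    return has_wait_cycle(waits_for, FAULTING_TASK);
}

/* Pagetable preallocated before the page lock is taken: the faulting
 * task never allocates while holding the lock, so it waits on nobody. */
bool prealloc_deadlocks(void)
{
    const int waits_for[NR_ACTORS] = {
        [FAULTING_TASK] = NOBODY,
        [OOM_KILLER]    = OOM_VICTIM,
        [OOM_VICTIM]    = FAULTING_TASK,
    };
    return has_wait_cycle(waits_for, FAULTING_TASK);
}

/* OOM killer no longer waits for a unique victim: it moves on to another. */
bool no_unique_victim_deadlocks(void)
{
    const int waits_for[NR_ACTORS] = {
        [FAULTING_TASK] = OOM_KILLER,
        [OOM_KILLER]    = NOBODY,
        [OOM_VICTIM]    = FAULTING_TASK,
    };
    return has_wait_cycle(waits_for, FAULTING_TASK);
}
```

Only the first table closes the loop; breaking any one edge is enough,
which is why either preallocation or the OOM killer changes would do.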

It's still silly that munlock_vma_pages_all() should require page lock
on each of those pages; but neither Michal nor I have had time to
revisit our attempts to relieve that requirement - mlock.c is not easy.

> 
> But what you propose above doesn't make the situation any worse. I'm fine with
> that.

Yes, I think that's right: if there is a problem, then it would already
be a problem since alloc_set_pte() was created; but we've seen no reports.

Hugh