lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 29 Aug 2017 10:33:52 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:     "Kirill A. Shutemov" <kirill@...temov.name>,
        Laurent Dufour <ldufour@...ux.vnet.ibm.com>,
        paulmck@...ux.vnet.ibm.com, akpm@...ux-foundation.org,
        ak@...ux.intel.com, mhocko@...nel.org, dave@...olabs.net,
        jack@...e.cz, Matthew Wilcox <willy@...radead.org>,
        mpe@...erman.id.au, paulus@...ba.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, hpa@...or.com,
        Will Deacon <will.deacon@....com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        haren@...ux.vnet.ibm.com, khandual@...ux.vnet.ibm.com,
        npiggin@...il.com, bsingharora@...il.com,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        linuxppc-dev@...ts.ozlabs.org, x86@...nel.org
Subject: Re: [PATCH v2 14/20] mm: Provide speculative fault infrastructure

On Tue, Aug 29, 2017 at 07:14:37AM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2017-08-28 at 11:37 +0200, Peter Zijlstra wrote:
> > > Doing all this job and just give up because we cannot allocate page tables
> > > looks very wasteful to me.
> > > 
> > > Have you considered to look how we can hand over from speculative to
> > > non-speculative path without starting from scratch (when possible)?
> > 
> > So we _can_ in fact allocate and install page-tables, but we have to be
> > very careful about it. The interesting case is where we race with
> > free_pgtables() and install a page that was just taken out.
> > 
> > But since we already have the VMA I think we can do something like:
> 
> That makes me extremely nervous... there could be all sort of
> assumptions esp. in arch code about the fact that we never populate the
> tree without the mm sem.

That _would_ be somewhat dodgy, because that means it needs to rely on
taking mmap_sem for _writing_ to undo things and arch/powerpc/ doesn't
have many down_write.*mmap_sem:

$ git grep "down_write.*mmap_sem" arch/powerpc/
arch/powerpc/kernel/vdso.c:     if (down_write_killable(&mm->mmap_sem))
arch/powerpc/kvm/book3s_64_vio.c:       down_write(&current->mm->mmap_sem);
arch/powerpc/mm/mmu_context_iommu.c:    down_write(&mm->mmap_sem);
arch/powerpc/mm/subpage-prot.c: down_write(&mm->mmap_sem);
arch/powerpc/mm/subpage-prot.c: down_write(&mm->mmap_sem);
arch/powerpc/mm/subpage-prot.c:         down_write(&mm->mmap_sem);

Then again, I suppose it could be relying on the implicit down_write
from things like munmap() and the like..

And things _ought_ to be ordered by the various PTLs
(mm->page_table_lock and pmd->lock) which of course doesn't mean
something accidentally snuck through.

> We'd have to audit archs closely. Things like the page walk cache
> flushing on power etc...

If you point me where to look, I'll have a poke around. I'm not
quite sure what you mean with pagewalk cache flushing. Your hash thing
flushes everything inside the PTL IIRC and the radix code appears fairly
'normal'.

> I don't mind the "retry" .. .we've brought stuff in the L1 cache
> already which I would expect to be the bulk of the overhead, and the
> allocation case isn't that common. Do we have numbers to show how
> destrimental this is today ?

No numbers, afaik. And like I said, I didn't consider this an actual
problem when I did these patches. But since Kirill asked ;-)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ