Date:   Tue, 23 Apr 2019 15:42:22 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Michel Lespinasse <walken@...gle.com>,
        Laurent Dufour <ldufour@...ux.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Andi Kleen <ak@...ux.intel.com>, dave@...olabs.net,
        Jan Kara <jack@...e.cz>, aneesh.kumar@...ux.ibm.com,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        mpe@...erman.id.au, Paul Mackerras <paulus@...ba.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Will Deacon <will.deacon@....com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        sergey.senozhatsky.work@...il.com,
        Andrea Arcangeli <aarcange@...hat.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        kemi.wang@...el.com, Daniel Jordan <daniel.m.jordan@...cle.com>,
        David Rientjes <rientjes@...gle.com>,
        Jerome Glisse <jglisse@...hat.com>,
        Ganesh Mahendran <opensource.ganesh@...il.com>,
        Minchan Kim <minchan@...nel.org>,
        Punit Agrawal <punitagrawal@...il.com>,
        vinayak menon <vinayakm.list@...il.com>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        zhong jiang <zhongjiang@...wei.com>,
        Haiyan Song <haiyanx.song@...el.com>,
        Balbir Singh <bsingharora@...il.com>, sj38.park@...il.com,
        Mike Rapoport <rppt@...ux.ibm.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>, haren@...ux.vnet.ibm.com,
        Nick Piggin <npiggin@...il.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        linuxppc-dev@...ts.ozlabs.org, x86@...nel.org
Subject: Re: [PATCH v12 00/31] Speculative page faults

On Tue 23-04-19 05:41:48, Matthew Wilcox wrote:
> On Tue, Apr 23, 2019 at 12:47:07PM +0200, Michal Hocko wrote:
> > On Mon 22-04-19 14:29:16, Michel Lespinasse wrote:
> > [...]
> > > I want to add a note about mmap_sem. In the past there has been
> > > discussions about replacing it with an interval lock, but these never
> > > went anywhere because, mostly, of the fact that such mechanisms were
> > > too expensive to use in the page fault path. I think adding the spf
> > > mechanism would invite us to revisit this issue - interval locks may
> > > be a great way to avoid blocking between unrelated mmap_sem writers
> > > (for example, do not delay stack creation for new threads while a
> > > large mmap or munmap may be going on), and probably also to handle
> > > mmap_sem readers that can't easily use the spf mechanism (for example,
> > > gup callers which make use of the returned vmas). But again that is a
> > > separate topic to explore which doesn't have to get resolved before
> > > spf goes in.
> > 
> > Well, I believe we should _really_ re-evaluate the range locking sooner
> > rather than later. Why? Because it looks like the most straightforward
> > approach to the mmap_sem contention for most usecases I have heard of
> > (mostly a mm{unm}ap, mremap standing in the way of page faults).
> > On a plus side it also makes us think about the current mmap (ab)users
> > which should lead to an overall code improvements and maintainability.
> 
> Dave Chinner recently did evaluate the range lock for solving a problem
> in XFS and didn't like what he saw:
> 
> https://lore.kernel.org/linux-fsdevel/20190418031013.GX29573@dread.disaster.area/T/#md981b32c12a2557a2dd0f79ad41d6c8df1f6f27c

Thank you, will have a look.

> I think scaling the lock needs to be tied to the actual data structure
> and not have a second tree on-the-side to fake-scale the locking.  Anyway,
> we're going to have a session on this at LSFMM, right?

I thought we had something on mmap_sem scaling, but I do not see
it in the list of proposed topics. But we can certainly add it there.

> > SPF sounds like a good idea but it is a really big and intrusive surgery
> > to the #PF path. And more importantly without any real world usecase
> > numbers which would justify this. That being said I am not opposed to
> > this change I just think it is a large hammer while we haven't seen
> > attempts to tackle problems in a simpler way.
> 
> I don't think the "no real world usecase numbers" is fair.  Laurent quoted:
> 
> > Ebizzy:
> > -------
> > The test is counting the number of records per second it can manage, the
> > higher is the best. I run it like this 'ebizzy -mTt <nrcpus>'. To get
> > consistent result I repeated the test 100 times and measure the average
> > result. The number is the record processes per second, the higher is the best.
> > 
> >                   BASE       SPF        delta
> > 24 CPUs x86       5492.69    9383.07    70.83%
> > 1024 CPUs P8 VM   8476.74    17144.38   102%
> 
> and cited 30% improvement for you-know-what product from an earlier
> version of the patch.

Well, we are talking about
45 files changed, 1277 insertions(+), 196 deletions(-)

which is _major_ surgery in my book. Asking for real-life workload numbers
is not unfair IMHO.

And let me remind you that I am not really opposing SPF in general. I
would just like to see a simpler approach tried before we go for such a
large change. If range locking is not really a scalable approach then all
right, but from what I've seen it should help a lot with most of the
bottlenecks I have come across.
-- 
Michal Hocko
SUSE Labs
