linux-kernel - Re: [RFC 08/16] vfio/type1: Cache locked_vm to ease mmap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220106011708.6ajbhzgreevu62gl@oracle.com>
Date:   Wed, 5 Jan 2022 20:17:08 -0500
From:   Daniel Jordan <daniel.m.jordan@...cle.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Alexander Duyck <alexanderduyck@...com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ben Segall <bsegall@...gle.com>,
        Cornelia Huck <cohuck@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Ingo Molnar <mingo@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Josh Triplett <josh@...htriplett.org>,
        Michal Hocko <mhocko@...e.com>, Nico Pache <npache@...hat.com>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        Steve Sistare <steven.sistare@...cle.com>,
        Tejun Heo <tj@...nel.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-mm@...ck.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org
Subject: Re: [RFC 08/16] vfio/type1: Cache locked_vm to ease mmap_lock
 contention

On Wed, Jan 05, 2022 at 08:53:39PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 05, 2022 at 07:46:48PM -0500, Daniel Jordan wrote:
> > padata threads hold mmap_lock as reader for the majority of their
> > runtime in order to call pin_user_pages_remote(), but they also
> > periodically take mmap_lock as writer for short periods to adjust
> > mm->locked_vm, hurting parallelism.
> > 
> > Alleviate the write-side contention with a per-thread cache of locked_vm
> > which allows taking mmap_lock as writer far less frequently.
> > 
> > Failure to refill the cache due to insufficient locked_vm will not cause
> > the entire pinning operation to error out.  This avoids spurious failure
> > in case some pinned pages aren't accounted to locked_vm.
> > 
> > Cache size is limited to provide some protection in the unlikely event
> > of a concurrent locked_vm accounting operation in the same address space
> > needlessly failing in case the cache takes more locked_vm than it needs.
> 
> Why not just do the pinned page accounting once at the start? Why does
> it have to be done incrementally?

Yeah, good question.  I tried doing it that way recently and it did
improve performance a bit, but I thought it wasn't enough of a gain to
justify how it overaccounted by the size of the entire pin.

If the concurrent accounting I worried about above isn't really a
concern, though, I can reconsider this.