lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 6 Jan 2022 14:05:27 -0700
From:   Alex Williamson <alex.williamson@...hat.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Daniel Jordan <daniel.m.jordan@...cle.com>,
        Alexander Duyck <alexanderduyck@...com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ben Segall <bsegall@...gle.com>,
        Cornelia Huck <cohuck@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Ingo Molnar <mingo@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Josh Triplett <josh@...htriplett.org>,
        Michal Hocko <mhocko@...e.com>, Nico Pache <npache@...hat.com>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        Steve Sistare <steven.sistare@...cle.com>,
        Tejun Heo <tj@...nel.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-mm@...ck.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org
Subject: Re: [RFC 08/16] vfio/type1: Cache locked_vm to ease mmap_lock
 contention

On Thu, 6 Jan 2022 08:34:56 -0400
Jason Gunthorpe <jgg@...dia.com> wrote:

> On Wed, Jan 05, 2022 at 08:17:08PM -0500, Daniel Jordan wrote:
> > On Wed, Jan 05, 2022 at 08:53:39PM -0400, Jason Gunthorpe wrote:  
> > > On Wed, Jan 05, 2022 at 07:46:48PM -0500, Daniel Jordan wrote:  
> > > > padata threads hold mmap_lock as reader for the majority of their
> > > > runtime in order to call pin_user_pages_remote(), but they also
> > > > periodically take mmap_lock as writer for short periods to adjust
> > > > mm->locked_vm, hurting parallelism.
> > > > 
> > > > Alleviate the write-side contention with a per-thread cache of locked_vm
> > > > which allows taking mmap_lock as writer far less frequently.
> > > > 
> > > > Failure to refill the cache due to insufficient locked_vm will not cause
> > > > the entire pinning operation to error out.  This avoids spurious failure
> > > > in case some pinned pages aren't accounted to locked_vm.
> > > > 
> > > > Cache size is limited to provide some protection in the unlikely event
> > > > of a concurrent locked_vm accounting operation in the same address space
> > > > needlessly failing in case the cache takes more locked_vm than it needs.  
> > > 
> > > Why not just do the pinned page accounting once at the start? Why does
> > > it have to be done incrementally?  
> > 
> > Yeah, good question.  I tried doing it that way recently and it did
> > improve performance a bit, but I thought it wasn't enough of a gain to
> > justify how it overaccounted by the size of the entire pin.  
> 
> Why would it over account?

We'd be guessing that the entire virtual address mapping counts against
locked memory limits, but it might include PFNMAP pages or pages that
are already account via the page pinning interface that mdev devices
use.  At that point we're risking that the user isn't concurrently
doing something else that could fail as a result of pre-accounting and
fixup later schemes like this.  Thanks,

Alex

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ