Message-ID: <20190214060006.GE24692@ziepe.ca>
Date: Wed, 13 Feb 2019 23:00:06 -0700
From: Jason Gunthorpe <jgg@...pe.ca>
To: Ira Weiny <ira.weiny@...el.com>
Cc: Daniel Jordan <daniel.m.jordan@...cle.com>,
akpm@...ux-foundation.org, dave@...olabs.net, jack@...e.cz,
cl@...ux.com, linux-mm@...ck.org, kvm@...r.kernel.org,
kvm-ppc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-fpga@...r.kernel.org, linux-kernel@...r.kernel.org,
alex.williamson@...hat.com, paulus@...abs.org,
benh@...nel.crashing.org, mpe@...erman.id.au, hao.wu@...el.com,
atull@...nel.org, mdf@...nel.org, aik@...abs.ru
Subject: Re: [PATCH 0/5] use pinned_vm instead of locked_vm to account pinned pages

On Wed, Feb 13, 2019 at 05:53:14PM -0800, Ira Weiny wrote:
> On Mon, Feb 11, 2019 at 03:54:47PM -0700, Jason Gunthorpe wrote:
> > On Mon, Feb 11, 2019 at 05:44:32PM -0500, Daniel Jordan wrote:
> >
> > > All five of these places, and probably some of Davidlohr's conversions,
> > > probably want to be collapsed into a common helper in the core mm for
> > > accounting pinned pages. I tried, and there are several details that
> > > likely need discussion, so this can be done as a follow-on.
> >
> > I've wondered the same..
>
> I'm really thinking this would be a nice way to ensure it gets cleaned up and
> does not happen again.
>
> Also, by moving it to the core we could better manage any user visible changes.
>
> From a high level, pinned is a subset of locked so it seems like we need 2
> sets of helpers.
>
> try_increment_locked_vm(...)
> decrement_locked_vm(...)
>
> try_increment_pinned_vm(...)
> decrement_pinned_vm(...)
>
> Where try_increment_pinned_vm() also increments locked_vm... Of course this
> may end up reverting the improvement of Davidlohr Bueso's atomic work... :-(
>
> Furthermore it would seem better (although I don't know if at all possible) if
> this were accounted for in core calls which tracked them based on how the pages
> are being used so that drivers can't call try_increment_locked_vm() and then
> pin the pages... Thus getting the account wrong vs what actually happened.
>
> And then in the end we can go back to locked_vm being the value checked against
> RLIMIT_MEMLOCK.

Someone would need to understand the bug that was fixed by splitting
them.

I think it had to do with double accounting pinned and mlocked pages
and thus delivering a lower than expected limit to userspace.

vfio has this bug, RDMA does not. RDMA has a bug where it can
overallocate locked memory, vfio doesn't.

Really unclear how to fix this. The pinned/locked split with two
buckets may be the right way.

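To make the two failure modes concrete, here is a rough userspace model,
not kernel code. Everything in it is an assumption made for illustration:
struct mm_model, charge(), the single_*() and split_*() helpers, and the
use of a plain page count as a stand-in for RLIMIT_MEMLOCK. It ignores
locking and atomics entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-mm counters; all values are in pages. */
struct mm_model {
	unsigned long locked_vm;	/* charged by mlock()-style callers */
	unsigned long pinned_vm;	/* charged by drivers pinning pages */
	unsigned long limit;		/* stand-in for RLIMIT_MEMLOCK */
};

/* Charge npages to one bucket unless that would exceed the limit. */
bool charge(unsigned long *bucket, unsigned long npages, unsigned long limit)
{
	assert(bucket);
	/* Written this way so the comparison cannot overflow. */
	if (npages > limit || *bucket > limit - npages)
		return false;
	*bucket += npages;
	return true;
}

/* Single bucket: mlock and pin are both charged to locked_vm. */
bool single_lock(struct mm_model *mm, unsigned long npages)
{
	return charge(&mm->locked_vm, npages, mm->limit);
}

bool single_pin(struct mm_model *mm, unsigned long npages)
{
	return charge(&mm->locked_vm, npages, mm->limit);
}

/* Two buckets: each counter is checked against the limit on its own. */
bool split_lock(struct mm_model *mm, unsigned long npages)
{
	return charge(&mm->locked_vm, npages, mm->limit);
}

bool split_pin(struct mm_model *mm, unsigned long npages)
{
	return charge(&mm->pinned_vm, npages, mm->limit);
}
```

With a limit of 100 pages, single_lock(80) followed by single_pin(80) on
the same 80 pages fails the second call, since the pages are counted
twice and userspace sees a lower limit than it expects. With the split
helpers, split_lock(100) and split_pin(100) on different pages both
succeed, so 200 pages end up unmovable against a limit of 100. Neither
variant is a fix, they only show why it is unclear which way to go.
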
Jason