Message-ID: <1406887459.4935.236.camel@pasglop>
Date: Fri, 01 Aug 2014 20:04:19 +1000
From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
To: Davidlohr Bueso <davidlohr@...com>
Cc: Alexey Kardashevskiy <aik@...abs.ru>,
Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Rik van Riel <riel@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Johannes Weiner <hannes@...xchg.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Sasha Levin <sasha.levin@...cle.com>,
Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
Vlastimil Babka <vbabka@...e.cz>,
Jörn Engel <joern@...fs.org>,
"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Alex Williamson <alex.williamson@...hat.com>,
Alexander Graf <agraf@...e.de>,
Michael Ellerman <michael@...erman.id.au>
Subject: Re: [RFC PATCH] mm: Add helpers for locked_vm
On Wed, 2014-07-30 at 03:31 -0700, Davidlohr Bueso wrote:
> It doesn't strike me that this is the place for this. It would seem that
> it would be the caller's responsibility to make sure of this (and not
> sure how !current can happen...).
>
> > +
> > + down_write(&current->mm->mmap_sem);
> > + locked = current->mm->locked_vm + npages;
> > + lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>
> nit: please set locked and lock_limit before taking the mmap_sem.

Won't it be racy to read current->mm->locked_vm without holding the
mmap_sem?

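
To make the concern concrete, here is a minimal userspace sketch (not the
patch itself). It assumes a pthread mutex standing in for mmap_sem, and the
names mm_model, model_increment_locked_vm and model_decrement_locked_vm are
hypothetical. The point is that the read of locked_vm and the update have to
sit in the same critical section; otherwise two callers can both pass the
check against a stale value. The limit does not depend on the mm, so in this
model it is passed in and could be computed before taking the lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace model of the mm fields discussed in the thread. */
struct mm_model {
	pthread_mutex_t sem;      /* stands in for mmap_sem (assumption) */
	unsigned long locked_vm;  /* accounted pages */
};

/*
 * Charge npages against lock_limit (both in pages).
 * The current value is read and updated under the same lock, so a
 * concurrent caller cannot slip in between the check and the update.
 * The CAP_IPC_LOCK override from the patch is left out of this model.
 */
static int model_increment_locked_vm(struct mm_model *mm,
				     unsigned long npages,
				     unsigned long lock_limit)
{
	unsigned long locked;
	int ret = 0;

	pthread_mutex_lock(&mm->sem);
	locked = mm->locked_vm + npages;  /* read under the lock */
	if (locked > lock_limit)
		ret = -ENOMEM;
	else
		mm->locked_vm = locked;
	pthread_mutex_unlock(&mm->sem);

	return ret;
}

/* Release a previous charge; the caller must not release more than it took. */
static void model_decrement_locked_vm(struct mm_model *mm,
				      unsigned long npages)
{
	pthread_mutex_lock(&mm->sem);
	assert(mm->locked_vm >= npages);
	mm->locked_vm -= npages;
	pthread_mutex_unlock(&mm->sem);
}
```

If the `locked = ...` line were hoisted above the lock, two callers could each
read the same old locked_vm, each pass the limit check, and together exceed
the limit.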
> > + if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
> > + pr_warn("RLIMIT_MEMLOCK (%ld) exceeded\n",
> > + rlimit(RLIMIT_MEMLOCK));
> > + ret = -ENOMEM;
> > + } else {
>
> It would be nicer to have it the other way around, leave the #else for
> ENOMEM. It reads better, imho.
>
> > + current->mm->locked_vm += npages;
>
> More importantly just setting locked_vm is not enough. You'll need to
> call do_mlock() here (again, addr granularity ;). This also applies to
> your decrement_locked_vm().

Do we actually need to do mlock? Basically this is VFIO doing
get_user_pages on a pile of guest/user memory. We are trying to account
for it, but I don't think we need the whole mlock business on top of it.

Also, address granularity cannot work. We predictively account for how
much the guest can lock, but we won't know how much it actually locks
until it does the DMA mappings, which is a fairly fast path.

In some cases, I think (Alexey, correct me if I'm wrong), we are trying
to account for kernel memory allocated on behalf of the guest, which is
not necessarily mapped as normal VMAs. It's mostly a way to prevent
a stray KVM/qemu guest from causing the kernel to allocate a ton of
pinned memory, by accounting it as part of the locked memory limits.

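
As a rough illustration of what "predictively account" means here, the
following userspace sketch charges a worst-case page count once, on a slow
path, and leaves the per-mapping fast path free of any accounting. Everything
in it is an assumption made for the example: the guest_model structure, the
function names, the 4 KiB page size and the single-window restriction are
hypothetical and are not taken from the VFIO code. Locking and the
CAP_IPC_LOCK override are omitted.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical model of per-guest accounting state. */
struct guest_model {
	unsigned long locked_vm;     /* pages charged so far */
	unsigned long lock_limit;    /* limit, in pages */
	unsigned long window_pages;  /* pages charged for the DMA window */
	unsigned long mapped_pages;  /* pages the guest has mapped so far */
};

/* Slow path: charge the whole window up front, before any mapping. */
static int guest_enable_window(struct guest_model *g,
			       unsigned long window_bytes)
{
	unsigned long npages = (window_bytes +
				(1UL << MODEL_PAGE_SHIFT) - 1) >> MODEL_PAGE_SHIFT;

	if (g->window_pages)
		return -EBUSY;  /* this model allows one window only */
	if (g->locked_vm > g->lock_limit ||
	    npages > g->lock_limit - g->locked_vm)
		return -ENOMEM;

	g->locked_vm += npages;
	g->window_pages = npages;
	return 0;
}

/* Fast path: no accounting, only a bound check against the window. */
static int guest_map_page(struct guest_model *g)
{
	if (g->mapped_pages >= g->window_pages)
		return -ENOSPC;
	g->mapped_pages++;
	return 0;
}

/* Slow path: give the whole charge back when the window goes away. */
static void guest_disable_window(struct guest_model *g)
{
	assert(g->locked_vm >= g->window_pages);
	g->locked_vm -= g->window_pages;
	g->window_pages = 0;
	g->mapped_pages = 0;
}
```

The charge is a count of pages rather than a set of address ranges, which is
why a per-address interface in the style of do_mlock() does not fit this use.
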
Ben.
> Thanks,
> Davidlohr
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/