Message-ID: <CAL1RGDWZ2LYO7ejPs9FvDzqze43cbfUEEdQVB=Ug2n3JpEe=AQ@mail.gmail.com>
Date: Thu, 9 Feb 2012 09:50:49 -0800
From: Roland Dreier <roland@...nel.org>
To: Hugh Dickins <hughd@...gle.com>
Cc: linux-rdma@...r.kernel.org, Andrea Arcangeli <aarcange@...hat.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH/RFC G-U-P experts] IB/umem: Modernize our get_user_pages() parameters
On Wed, Feb 8, 2012 at 3:10 PM, Hugh Dickins <hughd@...gle.com> wrote:
> A doubt assaulted me overnight: sorry, I'm back to not understanding.
>
> What are these access flags passed into ibv_reg_mr() that are enforced?
> What relation do they bear to what you will pass to __get_user_pages()?
The access flags are:
enum ibv_access_flags {
        IBV_ACCESS_LOCAL_WRITE   = 1,
        IBV_ACCESS_REMOTE_WRITE  = (1<<1),
        IBV_ACCESS_REMOTE_READ   = (1<<2),
        IBV_ACCESS_REMOTE_ATOMIC = (1<<3),
        IBV_ACCESS_MW_BIND       = (1<<4)
};
pretty much the only one of interest is IBV_ACCESS_REMOTE_READ --
all the others imply the possibility of RDMA HW writing to the page.
So basically if any flags other than IBV_ACCESS_REMOTE_READ are
set, we pass FOLL_WRITE to __get_user_pages(), otherwise we pass
the new FOLL_FOLLOW. [does "Marcia, Marcia, Marcia" mean anything
to a Brit? ;)]
i.e. the change from the status quo would be:
        [read-only] write=1, force=1 --> FOLL_FOLLOW
        [writeable] write=1, force=0 --> FOLL_WRITE (equivalent)
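
A minimal standalone sketch of that flag selection, for illustration only. FOLL_FOLLOW is just the flag being proposed in this thread, so the value given to it below is a made-up placeholder, and umem_gup_flags() is a hypothetical helper name, not an existing kernel function. The enum is repeated so that the snippet compiles on its own, outside the kernel and without the verbs headers.

```c
/* Access flags as listed above, repeated so this compiles standalone. */
enum ibv_access_flags {
	IBV_ACCESS_LOCAL_WRITE   = 1,
	IBV_ACCESS_REMOTE_WRITE  = (1 << 1),
	IBV_ACCESS_REMOTE_READ   = (1 << 2),
	IBV_ACCESS_REMOTE_ATOMIC = (1 << 3),
	IBV_ACCESS_MW_BIND       = (1 << 4)
};

/* FOLL_WRITE matches the kernel's value in include/linux/mm.h. */
#define FOLL_WRITE	0x01
/* ASSUMPTION: FOLL_FOLLOW is only proposed; this value is a placeholder. */
#define FOLL_FOLLOW	0x8000

/*
 * Hypothetical helper: choose the get_user_pages() flag for a
 * registration.  Any access flag other than IBV_ACCESS_REMOTE_READ
 * means the RDMA HW may write to the pages, so ask for FOLL_WRITE.
 * Otherwise follow the permissions of the vma.
 */
static unsigned int umem_gup_flags(int access)
{
	if (access & ~IBV_ACCESS_REMOTE_READ)
		return FOLL_WRITE;
	return FOLL_FOLLOW;
}
```

With this, a registration using only IBV_ACCESS_REMOTE_READ (or no access flags at all) gets FOLL_FOLLOW, and anything that includes a write-capable flag gets FOLL_WRITE, same as today's write=1, force=0 case.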
> You are asking for a FOLL_FOLLOW ("follow permissions of the vma") flag,
> which automatically works for read-write access to a VM_READ|VM_WRITE vma,
> but read-only access to a VM_READ-only vma, without you having to know
> which permission applies to which range of memory in the area specified.
> But you don't need that new flag to set up read-only access, and if you
> use that new flag to set up read-write access to an area which happens to
> contain VM_READ-only ranges, you have set it up to write into ZERO_PAGEs.
First of all, I kind of like FOLL_FOLLOW as the name :)
Now you're confusing me: I think we do need FOLL_FOLLOW to
set up read-only access -- we want to trigger the COWs that userspace
might trigger by touching the memory up front. This is to handle
a case like
    [userspace]

        int *buf = malloc(16 * 4096);
        // buf now points to 16 anonymous zero_pages
        mr = ibv_reg_mr(pd, buf, 16 * 4096, IBV_ACCESS_REMOTE_READ);
        // RDMA HW will only ever read buf, but...
        buf[0] = 2012;
        // COW triggered, first page of buf changed, RDMA HW has wrong mapping!
For something the RDMA HW might write to, then I agree we don't want
FOLL_FOLLOW -- we just would use FOLL_WRITE as we currently do.
When I get around to coding this up, I think I'm going to spend a lot
of time on the comments and on the commit log :)
- R.