Message-ID: <20150425114633.GI5561@linux.vnet.ibm.com>
Date: Sat, 25 Apr 2015 04:46:33 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Christoph Lameter <cl@...ux.com>
Cc: Jerome Glisse <j.glisse@...il.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
jglisse@...hat.com, mgorman@...e.de, aarcange@...hat.com,
riel@...hat.com, airlied@...hat.com,
aneesh.kumar@...ux.vnet.ibm.com,
Cameron Buschardt <cabuschardt@...dia.com>,
Mark Hairgrove <mhairgrove@...dia.com>,
Geoffrey Gerfin <ggerfin@...dia.com>,
John McKenna <jmckenna@...dia.com>, akpm@...ux-foundation.org
Subject: Re: Interacting with coherent memory on external devices
On Fri, Apr 24, 2015 at 03:00:18PM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > Still no answer as to why that is not possible with the current scheme?
> > > You keep talking about pointers and I keep responding that this is a
> > > matter of making the address space compatible on both sides.
> >
> > So if we do that in a naive way, how can we migrate a chunk of memory to
> > video memory while still properly handling the case where the CPU tries
> > to access that same memory while it is migrated to GPU memory?
>
> Well, that is the same issue handled by the migration code which I
> submitted to the kernel a long time ago.
Would you have a URL or other pointer to this code?
> > Without modifying a single line of mm code, the only way to do this is to
> > either unmap the range being migrated from the CPU page table or to
> > mprotect it in some way. In either case a CPU access will trigger some
> > kind of fault.
>
> Yes, that is how Linux migration works. If you can fix that, then how
> about improving page migration between NUMA nodes in Linux first?
In principle, that also would be a good thing. But why do that first?
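For concreteness, here is a minimal userspace sketch of the fault-on-access
scheme Jerome describes: revoke CPU access to the "migrated" range with
mprotect(), then handle the resulting fault. The migrate_page_back() helper
is a hypothetical stand-in for whatever would actually copy the data back
from device memory, and the handler plays fast and loose with
async-signal-safety for the sake of brevity:

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for copying a page back from device memory. */
static void migrate_page_back(void *page_addr)
{
        fprintf(stderr, "fault at %p: migrating page back\n", page_addr);
}

static void segv_handler(int sig, siginfo_t *si, void *uctx)
{
        long pagesz = sysconf(_SC_PAGESIZE);
        char *page = (char *)((unsigned long)si->si_addr & ~(pagesz - 1));

        (void)sig;
        (void)uctx;
        migrate_page_back(page);
        /* Re-enable CPU access to the faulting page. */
        mprotect(page, pagesz, PROT_READ | PROT_WRITE);
}

int main(void)
{
        struct sigaction sa;
        size_t size = 4 * sysconf(_SC_PAGESIZE);
        char *region = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        memset(region, 0xaa, size);

        memset(&sa, 0, sizeof(sa));
        sa.sa_flags = SA_SIGINFO;
        sa.sa_sigaction = segv_handler;
        sigaction(SIGSEGV, &sa, NULL);

        /* "Migrate" the range away: revoke CPU access so any touch faults. */
        mprotect(region, size, PROT_NONE);

        /* This access faults, runs the migrate-back path, then succeeds. */
        region[0] = 1;
        printf("value after fault-driven migration: %d\n", region[0]);

        munmap(region, size);
        return 0;
}

A real driver would of course do this from the kernel fault path rather than
from a SIGSEGV handler, but the control flow is the same.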
> > This is not the behavior we want. What we want is the same address space
> > while being able to migrate system memory to device memory (who makes
> > that decision should not be part of this discussion) while still
> > gracefully handling any CPU access.
>
> Well then there could be a situation where you have concurrent write
> access. How do you reconcile that then? Somehow you need to stall one or
> the other until the transaction is complete.
Or have store buffers on one or both sides.
> > This means that if the CPU accesses it, we want to migrate the memory
> > back to system memory. To achieve this there is no way around adding a
> > couple of ifs inside the mm page fault code path. Now do you want each
> > driver to add its own if branch, or do you want a common infrastructure
> > to do just that?
>
> If you can improve page migration in general then we certainly would
> love that. Having faultless migration is certainly a good thing for a
> lot of functionality that depends on page migration.
We do have to start somewhere, though. If we insist on perfection for
all situations before we agree to make a change, we won't be making very
many changes, now will we?
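Just to make sure we are talking about the same thing, here is roughly the
shape of the check Jerome has in mind, written as a standalone C sketch
rather than real mm code; the dev_mem_range structure and both helpers are
hypothetical placeholders for whatever the common infrastructure would
provide:

#include <stdbool.h>
#include <stddef.h>

struct dev_mem_range {
        unsigned long start;
        unsigned long end;
        bool on_device;         /* currently resident in device memory? */
};

/* Hypothetical: look up whether the faulting address is device-resident. */
static struct dev_mem_range *lookup_device_range(unsigned long addr)
{
        (void)addr;
        return NULL;            /* stub: no tracking in this sketch */
}

/* Hypothetical: copy the data back and restore the CPU page table entries. */
static int migrate_back_to_system(struct dev_mem_range *range)
{
        range->on_device = false;   /* stub: a real driver copies pages here */
        return 0;
}

/*
 * The extra branch in the (grossly simplified) fault path: if the faulting
 * address is backed by device memory, migrate it back before retrying the
 * access. Returns 0 if handled, nonzero to fall through to the normal path.
 */
int handle_device_fault(unsigned long fault_addr)
{
        struct dev_mem_range *range = lookup_device_range(fault_addr);

        if (!range || !range->on_device)
                return 1;       /* not ours, take the normal fault path */

        return migrate_back_to_system(range);
}

Whether that branch lives in each driver or in common code is exactly the
question being argued here.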
> > As I keep saying, the solution you propose is what we have today: a fake
> > shared address space achieved through the trick of remapping system
> > memory at the same address inside the GPU address space, combined with
> > enforcing the use of a special memory allocator that goes behind the
> > back of the mm code.
>
> Hmmm... I'd like to know more details about that.
As I understand it, the trick (if you can call it that) is having the
device have the same memory-mapping capabilities as the CPUs.
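In userspace terms, today's trick looks roughly like the sketch below;
gpu_map_at_cpu_va() is a hypothetical placeholder for whatever
vendor-specific call mirrors a pinned CPU range at the identical virtual
address in the GPU page tables:

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical driver call: mirror [va, va + len) into the GPU page tables. */
static int gpu_map_at_cpu_va(void *va, size_t len)
{
        /* A real implementation would be an ioctl on the device file. */
        fprintf(stderr, "mirroring %zu bytes at %p on the device\n", len, va);
        return 0;
}

/*
 * The "special allocator": only memory obtained through it is visible to
 * the GPU, which is exactly the limitation being argued against above.
 */
static void *shared_alloc(size_t size)
{
        size_t pagesz = sysconf(_SC_PAGESIZE);
        size_t len = (size + pagesz - 1) & ~(pagesz - 1);
        void *va;

        /* Page-aligned system memory, pinned so the device can rely on it. */
        va = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (va == MAP_FAILED)
                return NULL;
        mlock(va, len);

        /* Mirrored at the same virtual address in the GPU address space. */
        if (gpu_map_at_cpu_va(va, len)) {
                munmap(va, len);
                return NULL;
        }
        return va;
}

int main(void)
{
        int *p = shared_alloc(1024 * sizeof(*p));

        if (p)
                p[0] = 42;      /* the same pointer value is valid on the GPU */
        return 0;
}

The restriction to memory obtained from that special allocator is the part
Jerome wants to get rid of.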
> > As you pointed out, not using GPU memory is a waste and we want to be
> > able to use it. Now Paul has more sophisticated hardware that offers
> > opportunities to do things in a more transparent and efficient way.
>
> Does this also work between NUMA nodes in a Power8 system?
Heh! At the rate we are going with this discussion, Power8 will be
obsolete before we have this in. ;-)
Thanx, Paul