Message-ID: <alpine.DEB.2.00.1506261827090.20890@mdh-linux64-2.nvidia.com>
Date: Fri, 26 Jun 2015 18:34:16 -0700
From: Mark Hairgrove <mhairgrove@...dia.com>
To: Jerome Glisse <j.glisse@...il.com>
CC: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"joro@...tes.org" <joro@...tes.org>, Mel Gorman <mgorman@...e.de>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Johannes Weiner <jweiner@...hat.com>,
Larry Woodman <lwoodman@...hat.com>,
Rik van Riel <riel@...hat.com>,
Dave Airlie <airlied@...hat.com>,
Brendan Conoboy <blc@...hat.com>,
Joe Donohue <jdonohue@...hat.com>,
Duncan Poole <dpoole@...dia.com>,
Sherry Cheung <SCheung@...dia.com>,
Subhash Gutti <sgutti@...dia.com>,
John Hubbard <jhubbard@...dia.com>,
Lucien Dunning <ldunning@...dia.com>,
Cameron Buschardt <cabuschardt@...dia.com>,
Arvind Gopalakrishnan <arvindg@...dia.com>,
Haggai Eran <haggaie@...lanox.com>,
Shachar Raindel <raindel@...lanox.com>,
Liran Liss <liranl@...lanox.com>,
Roland Dreier <roland@...estorage.com>,
Ben Sander <ben.sander@....com>,
Greg Stoner <Greg.Stoner@....com>,
John Bridgman <John.Bridgman@....com>,
Michael Mantor <Michael.Mantor@....com>,
Paul Blinzer <Paul.Blinzer@....com>,
Laurent Morichetti <Laurent.Morichetti@....com>,
Alexander Deucher <Alexander.Deucher@....com>,
Oded Gabbay <Oded.Gabbay@....com>,
Jérôme Glisse <jglisse@...hat.com>,
Jatin Kumar <jakumar@...dia.com>
Subject: Re: [PATCH 06/36] HMM: add HMM page table v2.
On Fri, 26 Jun 2015, Jerome Glisse wrote:
> On Thu, Jun 25, 2015 at 03:57:29PM -0700, Mark Hairgrove wrote:
> > On Thu, 21 May 2015, j.glisse@...il.com wrote:
> > > From: Jérôme Glisse <jglisse@...hat.com>
> > > [...]
> > > +
> > > +void hmm_pt_iter_init(struct hmm_pt_iter *iter);
> > > +void hmm_pt_iter_fini(struct hmm_pt_iter *iter, struct hmm_pt *pt);
> > > +unsigned long hmm_pt_iter_next(struct hmm_pt_iter *iter,
> > > + struct hmm_pt *pt,
> > > + unsigned long addr,
> > > + unsigned long end);
> > > +dma_addr_t *hmm_pt_iter_update(struct hmm_pt_iter *iter,
> > > + struct hmm_pt *pt,
> > > + unsigned long addr);
> > > +dma_addr_t *hmm_pt_iter_fault(struct hmm_pt_iter *iter,
> > > + struct hmm_pt *pt,
> > > + unsigned long addr);
> >
> > I've got a few more thoughts on hmm_pt_iter after looking at some of the
> > later patches. I think I've convinced myself that this patch functionally
> > works as-is, but I've got some suggestions and questions about the design.
> >
> > Right now there are these three major functions:
> >
> > 1) hmm_pt_iter_update(addr)
> > - Returns the hmm_pte * for addr, or NULL if none exists.
> >
> > 2) hmm_pt_iter_fault(addr)
> > - Returns the hmm_pte * for addr, allocating a new one if none exists.
> >
> > 3) hmm_pt_iter_next(addr, end)
> > - Returns the next possibly-valid address. The caller must use
> > hmm_pt_iter_update to check if there really is an hmm_pte there.
> >
> > In my view, there are two sources of confusion here:
> > - Naming. "update" shares a name with the HMM mirror callback, and it also
> > implies that the page tables are "updated" as a result of the call.
> > "fault" likewise implies that the function handles a fault in some way.
> > Neither of these implications is true.
>
> Maybe hmm_pt_iter_walk & hmm_pt_iter_populate are better names?
hmm_pt_iter_populate sounds good. See below for _walk.
>
>
> > - hmm_pt_iter_next and hmm_pt_iter_update have some overlapping
> > functionality when compared to traditional iterators, requiring the
> > callers to all do this sort of thing:
> >
> > hmm_pte = hmm_pt_iter_update(&iter, &mirror->pt, addr);
> > if (!hmm_pte) {
> > addr = hmm_pt_iter_next(&iter, &mirror->pt,
> > addr, event->end);
> > continue;
> > }
> >
> > Wouldn't it be more efficient and simpler to have _next do all the
> > iteration internally so it always returns the next valid entry? Then you
> > could combine _update and _next into a single function, something along
> > these lines (which also addresses the naming concern):
> >
> > void hmm_pt_iter_init(iter, pt, start, end);
> > unsigned long hmm_pt_iter_next(iter, hmm_pte *);
> > unsigned long hmm_pt_iter_next_alloc(iter, hmm_pte *);
> >
> > hmm_pt_iter_next would return the address and ptep of the next valid
> > entry, taking the place of the existing _update and _next functions.
> > hmm_pt_iter_next_alloc takes the place of _fault.
> >
> > Also, since the _next functions don't take in an address, the iterator
> > doesn't have to handle the input addr being different from iter->cur.
>
> It would still need to do the same kind of test; this test is really to
> know when you switch from one directory to the next and to drop and take
> references accordingly.
But all of the directory references are already hidden entirely in the
iterator _update function. The caller only has to worry about taking
references on the bottom level, so I don't understand why the iterator
needs to return to the caller when it hits the end of a directory. Or for
that matter, why it returns every possible index within a directory to the
caller whether that index is valid or not.

If _next only returned to the caller when it hit a valid hmm_pte (or end),
then only one function would be needed (_next) instead of two
(_update/_walk and _next).
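To make the single-function idea concrete, here is a rough userspace sketch.
It is not the HMM code: the toy_* names, the two-level layout, the 4 KiB page
shift and the "bit 0 means valid" encoding are all assumptions made up for
illustration, and directory reference counting is left out entirely. It only
shows _next skipping unpopulated directories and invalid entries internally,
so the caller sees nothing but valid entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy layout, not the real hmm_pt geometry. */
#define TOY_SHIFT      12                   /* assumed 4 KiB pages */
#define TOY_LEAF_BITS  3                    /* 8 entries per leaf */
#define TOY_LEAF_SIZE  (1UL << TOY_LEAF_BITS)
#define TOY_DIR_SIZE   4UL                  /* 4 leaves per table */
#define TOY_VALID      1ULL                 /* assumed: bit 0 marks a valid entry */

struct toy_pt {
	uint64_t *leaf[TOY_DIR_SIZE];       /* NULL when a leaf is not populated */
};

struct toy_iter {
	struct toy_pt *pt;
	unsigned long cur;                  /* next address to examine */
	unsigned long end;                  /* exclusive upper bound */
};

/* The range is fixed at init time, so _next takes no address argument. */
static void toy_iter_init(struct toy_iter *it, struct toy_pt *pt,
			  unsigned long start, unsigned long end)
{
	unsigned long limit = (TOY_DIR_SIZE * TOY_LEAF_SIZE) << TOY_SHIFT;

	it->pt = pt;
	it->cur = start & ~((1UL << TOY_SHIFT) - 1);
	it->end = end < limit ? end : limit;
}

/*
 * Return the address of the next valid entry and store a pointer to it in
 * *ptep. When the range is exhausted, return it->end and set *ptep to NULL.
 */
static unsigned long toy_iter_next(struct toy_iter *it, uint64_t **ptep)
{
	while (it->cur < it->end) {
		unsigned long addr = it->cur;
		unsigned long idx = addr >> TOY_SHIFT;
		unsigned long dir = idx >> TOY_LEAF_BITS;
		unsigned long off = idx & (TOY_LEAF_SIZE - 1);
		uint64_t *leaf = it->pt->leaf[dir];

		if (!leaf) {
			/* Skip the whole missing leaf in one step. */
			it->cur = ((dir + 1) << TOY_LEAF_BITS) << TOY_SHIFT;
			continue;
		}
		it->cur = addr + (1UL << TOY_SHIFT);
		if (leaf[off] & TOY_VALID) {
			*ptep = &leaf[off];
			return addr;
		}
	}
	*ptep = NULL;
	return it->end;
}

/* Example caller: record the address of every valid entry in [start, end). */
static size_t toy_collect(struct toy_pt *pt, unsigned long start,
			  unsigned long end, unsigned long *out, size_t max)
{
	struct toy_iter it;
	uint64_t *ptep;
	unsigned long addr;
	size_t n = 0;

	toy_iter_init(&it, pt, start, end);
	while (n < max) {
		addr = toy_iter_next(&it, &ptep);
		if (!ptep)
			break;
		out[n++] = addr;
	}
	return n;
}
```

With this shape the caller loop in toy_collect has no "if NULL, call _next
and continue" step. The real iterator would additionally take and drop
directory references inside _next, which this sketch does not model.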
>
>
> > The logical extent of this is a callback approach like mm_walk. That would
> > be nice because the caller wouldn't have to worry about making the _init
> > and _fini calls. I assume you didn't go with this approach because
> > sometimes you need to iterate over hmm_pt while doing an mm_walk itself,
> > and you didn't want the overhead of nesting those?
>
> Correct, I do not want to do an hmm_pt_walk inside an mm_walk; that sounded
> and looked bad in my mind. That being said, I could add an hmm_pt_walk like
> mm_walk for device drivers and simply have it use hmm_pt_iter internally.
I agree that nesting walks feels bad. If we can get the hmm_pt_iter API
simple enough, I don't think an hmm_pt_walk callback approach is
necessary.
>
>
> > Finally, another minor thing I just noticed: shouldn't hmm_pt.h include
> > <linux/bitops.h> since it uses all of the clear/set/test bit APIs?
>
> Good catch, I forgot that.
>
> Cheers,
> Jérôme
>