Message-ID: <20250617152210.GA1552699@ziepe.ca>
Date: Tue, 17 Jun 2025 12:22:10 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: David Hildenbrand <david@...hat.com>
Cc: lizhe.67@...edance.com, alex.williamson@...hat.com,
akpm@...ux-foundation.org, peterx@...hat.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v4 2/3] gup: introduce unpin_user_folio_dirty_locked()
On Tue, Jun 17, 2025 at 04:04:26PM +0200, David Hildenbrand wrote:
> On 17.06.25 15:58, David Hildenbrand wrote:
> > On 17.06.25 15:45, David Hildenbrand wrote:
> > > On 17.06.25 15:42, Jason Gunthorpe wrote:
> > > > On Tue, Jun 17, 2025 at 12:18:20PM +0800, lizhe.67@...edance.com wrote:
> > > >
> > > > > @@ -360,12 +360,7 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
> > > > >  	for (i = 0; i < npages; i += nr) {
> > > > >  		folio = gup_folio_range_next(page, npages, i, &nr);
> > > > > -		if (make_dirty && !folio_test_dirty(folio)) {
> > > > > -			folio_lock(folio);
> > > > > -			folio_mark_dirty(folio);
> > > > > -			folio_unlock(folio);
> > > > > -		}
> > > > > -		gup_put_folio(folio, nr, FOLL_PIN);
> > > > > +		unpin_user_folio_dirty_locked(folio, nr, make_dirty);
> > > > >  	}
> > > >
> > > > I don't think we should call an exported function here - this is a
> > > > fast path for rdma and iommufd, I don't want to see it degrade to save
> > > > three duplicated lines :\
> > >
> > > Any way to quantify? In theory, the compiler could still optimize this
> > > within the same file, no?
> >
> > Looking at the compiler output, I think the compiler is doing exactly that.
> >
> > Unless my objdump -D -S analysis skills are seriously degraded :)
>
> FWIW, while already looking at this, even before this change, the compiler
> does not inline gup_put_folio() into this function, which is a bit
> unexpected.

Weird, but I would not expect this as a general rule, and I am not sure
we should rely on it.

I would say exported functions should not get automatically
inlined. That throws all the kprobes into chaos :\

BTW, why can't the other patches in this series just use
unpin_user_page_range_dirty_lock? The way this stuff is supposed to
work is to combine adjacent physical addresses and then invoke
unpin_user_page_range_dirty_lock() on the start page of the physical
range. This is why we have the gup_folio_range_next() which does the
segmentation in an efficient way.
Combining adjacent physical addresses is basically free math.
Segmenting to folios in the vfio side doesn't make a lot of sense,
IMHO.
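
A minimal userspace sketch of the coalescing described above, not code
from this series. The names struct pfn_range and coalesce_pfns() are
hypothetical and exist only for illustration. It assumes the caller
already holds an array of page frame numbers in unpin order, and it
only does the arithmetic of merging adjacent ones into
(start, npages) runs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: a run of physically contiguous pages. */
struct pfn_range {
	unsigned long start_pfn;	/* first page frame number of the run */
	unsigned long npages;		/* number of consecutive pages in the run */
};

/*
 * Hypothetical helper: walk pfns[0..n) in order and merge each PFN into
 * the current run when it directly follows the run's last page.
 * The caller must provide room for the result; max_out == n is always
 * enough. Returns the number of runs written to out.
 */
static size_t coalesce_pfns(const unsigned long *pfns, size_t n,
			    struct pfn_range *out, size_t max_out)
{
	size_t nranges = 0;

	for (size_t i = 0; i < n; i++) {
		struct pfn_range *last = nranges ? &out[nranges - 1] : NULL;

		/* Adjacent to the current run: extend it, one add and one compare. */
		if (last && last->start_pfn + last->npages == pfns[i]) {
			last->npages++;
			continue;
		}

		/* Not adjacent: start a new run. */
		assert(nranges < max_out);
		out[nranges].start_pfn = pfns[i];
		out[nranges].npages = 1;
		nranges++;
	}
	return nranges;
}
```

In the kernel, each resulting run would then presumably be handed to
unpin_user_page_range_dirty_lock() with the run's start page and
npages, leaving the per-folio segmentation to gup_folio_range_next().
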
Jason