Message-ID: <20230328155017.5636393b@meshulam.tesarici.cz>
Date: Tue, 28 Mar 2023 15:50:17 +0200
From: Petr Tesařík <petr@...arici.cz>
To: "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
Cc: Christoph Hellwig <hch@...radead.org>, "hch@....de" <hch@....de>,
"m.szyprowski@...sung.com" <m.szyprowski@...sung.com>,
"robin.murphy@....com" <robin.murphy@....com>,
Dexuan Cui <decui@...rosoft.com>,
Tianyu Lan <Tianyu.Lan@...rosoft.com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water
mark in debugfs
On Tue, 28 Mar 2023 13:12:13 +0000
"Michael Kelley (LINUX)" <mikelley@...rosoft.com> wrote:
> From: Christoph Hellwig <hch@...radead.org> Sent: Monday, March 27, 2023 6:34 PM
> >
> > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
> > > area->index = wrap_area_index(mem, index + nslots);
> > > area->used += nslots;
> > > spin_unlock_irqrestore(&area->lock, flags);
> > > +
> > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > + do {
> > > + if (new_used <= old_hiwater)
> > > + break;
> > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > +
> > > return slot_index;
> >
> > Hmm, so we're right in the swiotlb hot path here and add two new global
> > atomics?
>
> It's only one global atomic, except when the high water mark needs to be
> bumped. That results in an initial transient of doing the second global
> atomic, but then it won't be done unless there's a spike in usage or the
> high water mark is manually reset to zero. Of course, there's a similar
> global atomic subtract when the slots are released.
>
> Perhaps this accounting should go under #ifdef CONFIG_DEBUGFS? Or
> even add a swiotlb-specific debugfs config option to cover all the swiotlb
> debugfs code. From Petr Tesarik's earlier comments, it sounds like there
> is interest in additional accounting, such as for fragmentation.

For my purposes, it does not have to be 100% accurate. I don't really
mind if it is off by a few slots because of a race window, so we could
(for instance):

- update a local variable and set the atomic after the loop,
- or make it a per-CPU variable to reduce CPU cache bouncing,
- or just about anything that is less heavy-weight than an atomic
  CMPXCHG in the inner loop of a slot search.

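
To illustrate the last option, here is a rough userspace sketch using C11
atomics rather than the kernel's atomic_long_* API. It is not a proposed
patch: total_used and used_hiwater only mirror the names in the quoted hunk,
and account_alloc()/account_free() are hypothetical helpers. The point is
that the high water mark is updated with a plain relaxed load and store
instead of a CMPXCHG loop, accepting that a race can leave it a few slots
low.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-ins for the counters in the quoted patch. */
static atomic_long total_used;
static atomic_long used_hiwater;

/* Hypothetical helper: account for nslots newly allocated slots. */
static long account_alloc(long nslots)
{
	long new_used;

	assert(nslots > 0);

	/* One global atomic add, as in the original patch. */
	new_used = atomic_fetch_add_explicit(&total_used, nslots,
					     memory_order_relaxed) + nslots;

	/*
	 * Deliberately racy: no CMPXCHG retry loop. A concurrent updater
	 * may overwrite a slightly higher value, so the reported mark can
	 * end up a few slots too low.
	 */
	if (new_used > atomic_load_explicit(&used_hiwater,
					    memory_order_relaxed))
		atomic_store_explicit(&used_hiwater, new_used,
				      memory_order_relaxed);

	return new_used;
}

/* Hypothetical helper: account for nslots released slots. */
static void account_free(long nslots)
{
	atomic_fetch_sub_explicit(&total_used, nslots, memory_order_relaxed);
}
```

In a single-threaded run, allocating 4 and then 8 slots, freeing 8 and
allocating 2 more leaves total_used at 6 and used_hiwater at 12. With
concurrent callers the mark is only approximate, which is the trade-off
described above.
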
Just my two cents,
Petr T