Message-ID: <BYAPR21MB1688B76012618BEA6B0DE6A5D7889@BYAPR21MB1688.namprd21.prod.outlook.com>
Date:   Tue, 28 Mar 2023 15:17:47 +0000
From:   "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To:     Petr Tesařík <petr@...arici.cz>
CC:     Christoph Hellwig <hch@...radead.org>, "hch@....de" <hch@....de>,
        "m.szyprowski@...sung.com" <m.szyprowski@...sung.com>,
        "robin.murphy@....com" <robin.murphy@....com>,
        Dexuan Cui <decui@...rosoft.com>,
        Tianyu Lan <Tianyu.Lan@...rosoft.com>,
        "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water
 mark in debugfs

From: Petr Tesařík <petr@...arici.cz> Sent: Tuesday, March 28, 2023 8:08 AM
> 
> On Tue, 28 Mar 2023 14:29:03 +0000
> "Michael Kelley (LINUX)" <mikelley@...rosoft.com> wrote:
> 
> > From: Petr Tesařík <petr@...arici.cz> Sent: Tuesday, March 28, 2023 6:50 AM
> > >
> > > On Tue, 28 Mar 2023 13:12:13 +0000
> > > "Michael Kelley (LINUX)" <mikelley@...rosoft.com> wrote:
> > >
> > > > From: Christoph Hellwig <hch@...radead.org> Sent: Monday, March 27, 2023 6:34 PM
> > > > >
> > > > > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > > > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
> > > > > >   area->index = wrap_area_index(mem, index + nslots);
> > > > > >   area->used += nslots;
> > > > > >   spin_unlock_irqrestore(&area->lock, flags);
> > > > > > +
> > > > > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > > > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > > > > + do {
> > > > > > +         if (new_used <= old_hiwater)
> > > > > > +                 break;
> > > > > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > > > > +
> > > > > >   return slot_index;
> > > > >
> > > > > Hmm, so we're right in the swiotlb hot path here and add two new global
> > > > > atomics?
> >[...]
> > > For my purposes, it does not have to be 100% accurate. I don't really
> > > mind if it is off by a few slots because of a race window, so we could
> > > (for instance):
> > >
> > > - update a local variable and set the atomic after the loop,
> > > - or make it a per-cpu to reduce CPU cache bouncing,
> > > - or just about anything that is less heavy-weight than an atomic
> > >   CMPXCHG in the inner loop of a slot search.
> > >
> >
> > Perhaps I'm missing your point, but there's no loop here.  The atomic
> > add is done once per successful slot allocation.  If swiotlb_do_find_slots()
> > doesn't find any slots for the current area, it exits at the "not_found" label
> > and the atomic add isn't done.
> 
> My bad. I read the patch too quickly and thought that the update was
> done for each searched area. I stand corrected here.
> 
> >[...]
> > I thought about tracking the high water mark on a per-CPU basis or
> > per-area basis, but I don't think the resulting data is useful.  Adding up
> > the individual high water marks likely significantly over-estimates the
> > true high water mark.   Is there a clever way to make this useful that I'm
> > not thinking about?
> 
> No, not that I'm aware of. Min/max cannot be easily split.
> 
> >[...]
> > Regarding your other email about non-default io_tlb_mem instances,
> > my patch just extends what is already reported in debugfs, which
> > is only for the default io_tlb_mem.   The non-default instances seemed
> > to me to be fairly niche cases that weren't worth the additional
> > complexity, but maybe I'm wrong about that.
> 
> What I mean is that the values currently reported in debugfs only refer
> to io_tlb_default_mem. Since restricted DMA pools also use
> swiotlb_find_slots() and swiotlb_release_slots(), the global counters
> now get updated both for io_tlb_default_mem and all restricted DMA
> pools.
> 
> In short, this hunk is a change in behaviour:
> 
>  static int io_tlb_used_get(void *data, u64 *val)
>  {
> -	*val = mem_used(&io_tlb_default_mem);
> +	*val = (u64)atomic_long_read(&total_used);
>  	return 0;
>  }
> 
> Before the change, it shows the number of used slots in the default
> SWIOTLB, after the change it shows the total number of used slots in
> the SWIOTLB and all restricted DMA pools.
> 

Got it -- I understand your point now. You are right. I'll fix it in the next version.

Michael
