linux-kernel - Re: [PATCH] swiotlb: fix the check whether a device has used software IO TLB

Message-ID: <20230922153129.69b26975@meshulam.tesarici.cz>
Date:   Fri, 22 Sep 2023 15:31:29 +0200
From:   Petr Tesařík <petr@...arici.cz>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Christoph Hellwig <hch@....de>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Robin Murphy <robin.murphy@....com>,
        "open list:DMA MAPPING HELPERS" <iommu@...ts.linux.dev>,
        open list <linux-kernel@...r.kernel.org>,
        Roberto Sassu <roberto.sassu@...weicloud.com>,
        Jonathan Corbet <corbet@....net>
Subject: Re: [PATCH] swiotlb: fix the check whether a device has used
 software IO TLB

Hi Catalin,

thanks again for your reply. I'm sorry for being slow. This world of
weakly ordered memory models is complex, and I was too distracted most
of this week, but I hope I have finally wrapped my head around it.

On Mon, 18 Sep 2023 16:45:34 +0100
Catalin Marinas <catalin.marinas@....com> wrote:

> On Sun, Sep 17, 2023 at 11:47:41AM +0200, Petr Tesařík wrote:
> > On Fri, 15 Sep 2023 18:09:28 +0100
> > Catalin Marinas <catalin.marinas@....com> wrote:  
> > > On Fri, Sep 15, 2023 at 11:13:43AM +0200, Petr Tesařík wrote:  
> > > > On Thu, 14 Sep 2023 19:28:01 +0100
> > > > Catalin Marinas <catalin.marinas@....com> wrote:    
> > > > > What do the smp_wmb() barriers in swiotlb_find_slots() and
> > > > > swiotlb_dyn_alloc() order? The latter is even more unclear as it's at
> > > > > the end of the function and the "pairing" comment doesn't help.    
> > > > 
> > > > By the time swiotlb_find_slots() returns a valid slot index, the new
> > > > value of dev->dma_uses_io_tlb must be visible by all CPUs in
> > > > is_swiotlb_buffer(). The index is used to calculate the bounce buffer
> > > > address returned to device drivers. This address may be passed to
> > > > another CPU and used as an argument to is_swiotlb_buffer().    
> > > 
> > > Ah, I remember now. So the smp_wmb() ensures that dma_uses_io_tlb is
> > > seen by other CPUs before the slot address (presumably passed via other
> > > memory write). It may be worth updating the comment in the code (I'm
> > > sure I'll forget it in a month time). The smp_rmb() before READ_ONCE()
> > > in this patch is also needed for the same reasons (ordering after the
> > > read of the address passed to is_swiotlb_buffer()).  
>[...]
> > > BTW, you may want to use WRITE_ONCE() when setting dma_uses_io_tlb (it
> > > also matches the READ_ONCE() in is_swiotlb_buffer()). Or you can use
> > > smp_store_mb() (but check its semantics first).  
> > 
> > I can use WRITE_ONCE(), although I believe it does not make much
> > difference thanks to the barrier provided by smp_wmb().  
> 
> WRITE_ONCE() is about atomicity rather than ordering (and avoiding
> compiler optimisations messing things). While I don't see the compiler
> generating multiple accesses for a boolean write, using these accessors
> also helps tools like kcsan.

While I still believe a simple assignment works just fine here, I agree
that WRITE_ONCE() is better. It can prevent potential bugs if someone
ever turns the boolean into something else.
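
To illustrate the accessor pairing, here is a minimal userspace sketch. It
assumes simplified stand-ins for WRITE_ONCE() and READ_ONCE() (the real
kernel macros do more), a GCC/Clang compiler for __typeof__, and a
hypothetical struct fake_device in place of struct device. The helper names
are made up for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified userspace stand-ins (assumption): a volatile access makes
 * the compiler emit the access as written, which is the core of what
 * the kernel accessors provide.
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the relevant field of struct device. */
struct fake_device {
	bool dma_uses_io_tlb;
};

/* Writer side: what swiotlb_find_slots() would do on success. */
static void mark_swiotlb_used(struct fake_device *dev)
{
	WRITE_ONCE(dev->dma_uses_io_tlb, true);
}

/* Reader side: the matching access in is_swiotlb_buffer(). */
static bool swiotlb_was_used(struct fake_device *dev)
{
	return READ_ONCE(dev->dma_uses_io_tlb);
}

/* Small self-check: the flag starts clear and reads back as set. */
bool demo_once_accessors(void)
{
	struct fake_device dev = { .dma_uses_io_tlb = false };

	assert(!swiotlb_was_used(&dev));
	mark_swiotlb_used(&dev);
	return swiotlb_was_used(&dev);
}
```

The accessors provide no ordering on their own; they only mark the access
as intentionally concurrent, which is what tools like kcsan look for.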

>[...]
> > Ah... You may have a point after all if this sequence of events is
> > possible:
> > 
> > - CPU 0 writes new value to mem->pools->next in swiotlb_dyn_alloc().
> > 
> > - CPU 1 observes the new value in swiotlb_find_slots(), even though it
> >   is not guaranteed by any barrier, allocates a slot and sets the
> >   dev->dma_uses_io_tlb flag.
> > 
> > - CPU 1 (driver code) writes the returned buffer address into its
> >   private struct. This write is ordered after dev->dma_uses_io_tlb
> >   thanks to the smp_wmb() in swiotlb_find_slots().
> > 
> > - CPU 2 (driver code) reads the buffer address, and DMA core passes it
> >   to is_swiotlb_buffer(), which contains smp_rmb().
> > 
> > - IIUC CPU 2 is guaranteed to observe the new value of
> >   dev->dma_uses_io_tlb, but it may still use the old value of
> >   mem->pools->next, because the write on CPU 0 was not ordered
> >   against anything. The fact that the new value was observed by CPU 1
> >   does not mean that it is also observed by CPU 2.  
> 
> Yes, that's possible. On CPU 1 there is a control dependency between the
> read of mem->pools->next and the write of dev->dma_uses_io_tlb but I
> don't think this is sufficient to claim multi-copy atomicity (if CPU 1
> sees mem->pools->next write by CPU 0, CPU 2 must see it as well), at
> least not on all architectures supported by Linux. memory-barriers.txt
> says that a full barrier on CPU 1 is needed between the read and write,
> i.e. smp_mb() before WRITE_ONCE(dev->dma_uses_io_tlb). You could add it
> just before "goto found" in swiotlb_find_slots() since it's only needed
> on this path.

Let me check my understanding. This smp_mb() is not needed to make sure
that the write to dev->dma_uses_io_tlb cannot be visible before the
read of mem->pools->next. Since stores are not speculated, that
ordering is provided by the control dependency alone.

But a general barrier ensures that a third CPU which observes the write
to dev->dma_uses_io_tlb will also observe the write to mem->pools->next
that CPU 1 had read. Makes sense.
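
The placement of the barriers can be sketched in userspace with C11 atomics
and pthreads. This is an approximation under stated assumptions, not the
kernel code: the smp_*() macros below map to C11 fences, which are stronger
than the kernel's smp_wmb()/smp_rmb(), and the C11 memory model is not the
Linux kernel memory model. The variable names, thread functions and the
0x1000 address are placeholders. The sketch exercises only the two-CPU
pairing between the flag and the buffer address; the three-CPU case with
mem->pools->next is not tested, only marked by the smp_mb() placement.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Approximate stand-ins (assumption): C11 fences, stronger than needed. */
#define smp_mb()  atomic_thread_fence(memory_order_seq_cst)
#define smp_wmb() atomic_thread_fence(memory_order_release)
#define smp_rmb() atomic_thread_fence(memory_order_acquire)

static atomic_bool dma_uses_io_tlb;   /* stands in for dev->dma_uses_io_tlb */
static atomic_uintptr_t shared_addr;  /* driver-private copy of the address */
static bool flag_seen_by_checker;     /* read only after pthread_join() */

/* "CPU 1": plays the role of swiotlb_find_slots() plus the driver store. */
static void *mapper_thread(void *arg)
{
	(void)arg;
	/* Suggested position of the full barrier, after the pool list read. */
	smp_mb();
	atomic_store_explicit(&dma_uses_io_tlb, true, memory_order_relaxed);
	/* Order the flag before the address becomes visible. */
	smp_wmb();
	atomic_store_explicit(&shared_addr, (uintptr_t)0x1000,
			      memory_order_relaxed);
	return NULL;
}

/* "CPU 2": reads the address, then checks the flag as is_swiotlb_buffer(). */
static void *checker_thread(void *arg)
{
	(void)arg;
	while (atomic_load_explicit(&shared_addr, memory_order_relaxed) == 0)
		;	/* wait until the address has been published */
	/* Order the flag read after the address read. */
	smp_rmb();
	flag_seen_by_checker =
		atomic_load_explicit(&dma_uses_io_tlb, memory_order_relaxed);
	return NULL;
}

/* Runs both threads once; returns the flag value the checker observed. */
bool run_message_passing(void)
{
	pthread_t mapper, checker;
	int rc;

	atomic_store(&dma_uses_io_tlb, false);
	atomic_store(&shared_addr, 0);
	flag_seen_by_checker = false;

	rc = pthread_create(&checker, NULL, checker_thread, NULL);
	assert(rc == 0);
	rc = pthread_create(&mapper, NULL, mapper_thread, NULL);
	assert(rc == 0);
	pthread_join(mapper, NULL);
	pthread_join(checker, NULL);
	return flag_seen_by_checker;
}
```

Built with something like "cc -std=c11 -pthread", run_message_passing()
should return true: once the checker has seen the address, the fence pair
guarantees that it also sees the flag.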

I think I can send a v2 of my patch now, with abundant comments on the
memory barriers.

> Another thing I noticed - the write in add_mem_pool() to mem->nslabs is
> not ordered with list_add_rcu(). I assume swiotlb_find_slots() doesn't
> need to access it since it just walks the mem->pools list.

That's correct. Writes to mem->nslabs are known to be racy, but it
doesn't matter. This is explained in commit 1aaa736815eb ("swiotlb:
allocate a new memory pool when existing pools are full"):

- swiotlb_tbl_map_single() and is_swiotlb_active() only check for non-zero
  value. This is ensured by the existence of the default memory pool,
  allocated at boot.
    
- The exact value is used only for non-critical purposes (debugfs, kernel
  messages).

Petr T